EnroWiki : CitationAnalysis

This is an old revision of CitationAnalysis from 2006-02-13 22:01:56.

Citation analyses in journals and patents

Citations in patents

One can gauge the pace of innovation by measuring median time lags from citing applications to cited patent grant dates. […] citation rates vary by year of issue (earlier patents have had more time to be cited) and by technology area. Normalization is necessary. (Porter, Alan L. and Cunningham, Scott W., Tech mining: Exploiting New Technologies for Competitive Advantage, John Wiley & Sons, 2005 p. 238)
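The two measurements mentioned above (a median citation lag, and a citation count normalized by year of issue and technology area) could be sketched as follows. This is a minimal illustration, not Porter and Cunningham's procedure: the record layout, the field names (`id`, `year`, `area`, `citations`), the choice of the cohort mean as the normalizing baseline, and all sample data are assumptions.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean, median


def median_lag_years(citation_pairs):
    """Median lag, in years, between each cited patent's grant date and
    the date of the application citing it.

    citation_pairs: iterable of (citing_application_date, cited_grant_date).
    """
    lags = [(citing - cited).days / 365.25 for citing, cited in citation_pairs]
    return median(lags)


def normalized_citation_counts(patents):
    """Divide each patent's citation count by the mean count of its cohort.

    A cohort is assumed here to be all patents sharing the same issue year
    and technology area; a value above 1.0 means "cited more than its peers".
    """
    cohorts = defaultdict(list)
    for p in patents:
        cohorts[(p["year"], p["area"])].append(p["citations"])
    normalized = {}
    for p in patents:
        cohort_mean = mean(cohorts[(p["year"], p["area"])])
        normalized[p["id"]] = p["citations"] / cohort_mean if cohort_mean else 0.0
    return normalized


# Hypothetical data, for illustration only.
grant = date(1995, 1, 1)
pairs = [(grant + timedelta(days=d), grant) for d in (1000, 2000, 3000)]
patents = [
    {"id": "A", "year": 1995, "area": "chem", "citations": 10},
    {"id": "B", "year": 1995, "area": "chem", "citations": 30},
    {"id": "C", "year": 2003, "area": "chem", "citations": 2},
    {"id": "D", "year": 2003, "area": "chem", "citations": 6},
]
print(median_lag_years(pairs))
print(normalized_citation_counts(patents))
```

In this made-up sample, patent C has far fewer raw citations than patent A, yet both score the same once each is compared with patents of its own issue year, which is the effect the normalization is meant to correct for.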

[The citation] information gives the closest state of the art detected by a patent examiner during the statutory search. When it takes him about 5 min to classify a patent according to the IPC scheme, his search lasts for about 1 day. In other words the work involved in producing the citation field represents the biggest added value to a patent document. It is therefore legitimate to use this field as much as possible. (Schwander, Paul, An evaluation of patent searching resources: comparing the professional and free on-line databases, in 22 World Patent Information, 147-165 (2000))

Distinguish between:
1) everyday use of citations (as a search tool, as indicated by Garfield; reverse searching; understanding the scope of the invention)
2) citation studies, whose aims may include:
a) scientometrics (evaluating a research organization, studying the links between research and technology…)
b) IP management & competitive intelligence (portfolio value evaluation, competitor tracking, licensing opportunities…)

Little has been published on the cognitive and sociological functions of citations in patents (Collins & Wyatt, 1988). This is in marked contrast to the many studies on citer motivations in journal articles (Cronin, 1984). (Oppenheim, Charles, Do Patent Citations Count?, in Cronin, Blaise & Barsky Atkins, Helen (ed.), The Web of Knowledge: A Festschrift in honor of Eugene Garfield, Information Today, Inc., 405, 2000)

Some of the citations made by examiners appear bizarre to librarians or information scientists. Garfield (1979, p. 39) commented cautiously that “there are no categorical answers” to the question of whether items cited by patent examiners are indeed relevant to the subject area. Garfield (1966), Oppenheim (1976), Dunlop & Oppenheim (1980) and van Dulken (1999) provide evidence that many examiner citations are inappropriate. […] only about one third of all examiner citations have a close relationship to the subject matter of the citing patents, although they are in the same broad technologies. […] One particularly interesting result was that where two patents in the same family were checked, one an EPO patent and the other a US patent, more than 75% of the references cited were different. (Oppenheim, Charles, Do Patent Citations Count?, op. cit.)

Despite all this confusion, there can be no disputing Garfield’s statement (Garfield, 1979) that “a citation index of the patent literature identifies relationships between patents that are not identified any other way.” (Oppenheim, Charles, Do Patent Citations Count?, op. cit.)

The approach taken in this paper is to break down the use of patent citation analysis into five sub-headings:

Evaluation of the performance of an industry or a country’s technology.
Tracing the transfer of knowledge from science to technology, from technology to technology, or from the defence to civil fields. This is currently the most popular area for research.
Identifying key earlier patents for patent litigation purposes, for identifying the history of a technical subject, and in particular for identifying key pioneer patents.
Identifying the speed of development of a technical subject
Miscellaneous applications (Oppenheim, Charles, Do Patent Citations Count?, op. cit.)

Both Trajtenberg (1990) and Albert et al. (1991) point out that the technological impact and commercial value of patents are two separate issues. Because of the difficulties of separating out these two factors, most researchers are content to rely on the use of terms such as “importance”. As Narin, Noma & Perry (1987) point out, patents are probably highly cited for two sometimes interrelated reasons. The first is that they are seminal patents; this implies that the originating company will have a disproportionate share of that technology. Secondly, high citations are often due to follow up patents from the same company. That is, highly cited patents are often part of a tightly interlocked stream of inventions from the company. Overall, then, the evidence is still inconclusive, but there is some evidence that highly cited patents are indeed those that are technologically or economically important. (Oppenheim, Charles, Do Patent Citations Count?, op. cit.)

However, the most detailed criticism of patent citation analysis can be found in Kaback, Lambert & Simmons (1994). These three authors are extremely experienced patent searchers from the chemical and pharmaceutical industries, and their criticisms need to be considered carefully. They emphasise that patents are not governed by the same rules of etiquette as journal articles. The references that are made by the applicant rarely look like the bibliography of a journal article. The authors of the patent application wish to avoid any implication that the current patent application grew naturally out of earlier work. Thus, most of the prior art that is cited by the applicants relates to unsuccessful approaches to the question. Turning to the examiner citations, these are driven solely by the claims in the applicant’s patent specification. The claims precisely define the monopoly right that the applicant wishes to gain. The examiner is only required to cite one reference that anticipates the claim in some way. The examiner may add further citations, but the primary function of the citations remains the same – to prove that what is claimed is not new. This point should be stressed: the text of the patent claims is not identical to, and does not necessarily reflect, the text of the remainder of the patent specification. The examiner’s preoccupation with the claims means that the items cited by the examiner do not necessarily reflect the bulk of the patent specification. […] The authors conclude that patent citations are useful as subject matter search tools, just as Garfield suggested. They also agree that frequently the highly cited patents are indeed the industrially important breakthroughs. In summary, they warn strongly against simplistic use of citation counts. (Oppenheim, Charles, Do Patent Citations Count?, op. cit.)

Patents provide far fewer citations than journal articles, so the possibilities for statistical analysis are limited. There is no evidence that either examiner or applicant citations reflect the subject matter of the citing patent. Their reasons for citing are different, but neither is to do with providing a useful background literature survey. In any case, there is a paucity of understanding about examiner and applicant citing motivations. (Oppenheim, Charles, Do Patent Citations Count?, op. cit.)

The use of citations as a quality indicator of patents is advocated by empirical research, which has established a positive relationship between the citation frequency of patents and different measures of commercial success. (Ernst1998 p.7)

Citations in journals

Computing scientometric indicators

Vinkler (2000) summarizes the applicability of a range of metrics, varying in degree of sophistication, for evaluating the performance of research teams, differentiating "gross" indicators (e.g., raw counts of citations received) from "specific" indicators (e.g., number of citations per paper or per researcher), "distribution" indicators (e.g., proportion of total citations received by all research teams being compared), and "relative" indicators such as Vinkler's Relative Citation Rate (RCR): the number of citations received, divided by the sum of the impact factors of the journals where the cited papers were published. This last metric is an example of a measure that compares counts of observed citations with estimates of some "expected" citation score, and is similar to the categorical journal impact used by ISI in their "macro" journal studies (see Garfield, Eugene, Journal Citation Studies. 20. Agriculture Journals and the Agricultural Literature, in Current Contents, 20, 5-11 (1975), http://www.garfield.library.upenn.edu/essays/v2p272y1974-76.pdf for an early example of such a study). (Borgman, C & Furner, J, Scholarly Communication and Bibliometrics, in Cronin, Blaise (ed.), Annual Review of Information Science and Technology, Information Today, Inc., Medford, NJ, vol. 36, 3-72, 2002, http://polaris.gseis.ucla.edu/jfurner/arist02.pdf)
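The four families of indicators described in the excerpt above can be illustrated with a short sketch. It follows only the wording of the excerpt (RCR as citations received divided by the sum of the journals' impact factors); the function names, the record layout (`citations`, `journal_if`), and the sample figures are assumptions, and a real study would also have to fix the citation window and the impact-factor year, which are not modelled here.

```python
def gross_citations(papers):
    """'Gross' indicator: raw count of citations received by a team."""
    return sum(p["citations"] for p in papers)


def citations_per_paper(papers):
    """'Specific' indicator: citations per paper."""
    return gross_citations(papers) / len(papers)


def citation_share(team_papers, all_teams_papers):
    """'Distribution' indicator: the team's share of all citations
    received by the teams being compared."""
    total = sum(gross_citations(papers) for papers in all_teams_papers)
    return gross_citations(team_papers) / total


def relative_citation_rate(papers):
    """'Relative' indicator, as worded in the excerpt: observed citations
    divided by the sum of the impact factors of the publishing journals,
    the latter acting as an 'expected' citation score."""
    expected = sum(p["journal_if"] for p in papers)
    if expected == 0:
        raise ValueError("sum of impact factors is zero; RCR is undefined")
    return gross_citations(papers) / expected


# Hypothetical teams, for illustration only.
team_a = [
    {"citations": 12, "journal_if": 2.0},
    {"citations": 3, "journal_if": 1.0},
    {"citations": 0, "journal_if": 2.0},
]
team_b = [
    {"citations": 40, "journal_if": 10.0},
    {"citations": 5, "journal_if": 5.0},
]

for name, team in (("A", team_a), ("B", team_b)):
    print(
        name,
        gross_citations(team),
        citations_per_paper(team),
        citation_share(team, [team_a, team_b]),
        relative_citation_rate(team),
    )
```

In this made-up sample, team B leads on the gross, specific, and distribution indicators, while team A has the higher RCR because its papers appeared in journals with lower impact factors. This is the kind of reversal a "relative" indicator is meant to expose.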

Sociology of citation, general theory, motivations and behavior

Echoing Cronin (1984), calls for a "theory of citing" have long been a regular feature of the bibliometric literature. In a discussion paper published in Scientometrics along with invited responses from such as Cronin (1998), Egghe (1998), and Kostoff (1998), Leydesdorff (1998) re-articulates the plea ("Citation analysis calls for a theory of what is being analyzed; citation analysts consequently tend to be in need of theoretical legitimation" (p. 5)), and supplies a major contribution to the debate about the possibility and nature of a theory of this kind. Leydesdorff distinguishes between at least two things to be explained in any theory of citing: the citation per se, and citation analysis as an area of study. He sketches the histories both of citation practice (identifying shifts over time in the function and role of citations), and of citation analysis, positioning the latter in the framework provided by the interdisciplinary field of science and technology studies. He paints a rich portrait of the inherent complexity of citation practice, arguing that citation networks are dual-layered (the result of interaction between first-order, social networks of authors and second-order networks of "communications" or texts). He uses this distinction to demonstrate that any individual cited-citing pair may be viewed as an author-author, text-text, author-text, or text-author relation, as well as either at a disaggregated (micro-) level or at various (macro-) levels of aggregation, and suggests a two-facet taxonomy of the functions of citations on this basis. He further concludes that social and cognitive perspectives on citation practice are equally necessary; that there thus exists a multiplicity of theories of citation; and that it remains "uncertain" whether a meta-theory reconciling the insights, for example, of qualitative and quantitative studies is attainable. (Borgman, C & Furner, J, Scholarly Communication and Bibliometrics, op. cit.)

At the most general level, much current intellectual development in citation studies is related to a tendency for research designers simply to take more seriously the notions that citer behavior, like relevance judges' behavior in general (Schamber, Eisenberg, & Nilan, 1990), is (a) individual and subjective, in that different people, even when placed in otherwise similar situations and taking into account similar factors, will make different decisions; (b) complex and multidimensional, in that single decisions are often based on multiple factors, and multiple kinds of factors, simultaneously; and (c) dynamic and situational, in that, on different occasions or when placed in different situations, people take account of different factors and make different decisions.
Studies of relevance judges' behavior — i.e., studies of those decisions and actions of information seekers that are based on their judgments as to whether or not particular documents are relevant to them in particular situations — are core to the sub-field of library and information science (LIS) that is devoted to understanding information-related behavior. Furthermore, the perception that there is an important analogy to be drawn between linking behavior and the making of relevance judgments has been expressed with increasing frequency. Harter (1992) puts it as follows: "An author who includes particular citations in his list of references is announcing to readers the historical relevance of these citations to the research; at some point in the research or writing process the author found each reference relevant. Relevance is the idea that connects IR to bibliometrics, and understanding in one context should aid our understanding of it in the other." Studies of linking behavior may thus be explicitly positioned not simply as contributions to the general literature of information-related behavior, but specifically as close relatives of impressive recent work that has led to an improved understanding of the criteria used by information seekers when judging relevance. (Borgman, C & Furner, J, Scholarly Communication and Bibliometrics, op. cit.)

Reports of notable studies in which researchers have sought to elicit citers' opinions about their own citing activity have appeared in three recent articles. (…) Their [Shadish et al. (1995)] results were that a highly cited work is more likely than a less-cited work to be the following:

perceived as an "exemplar", i.e., as a classic reference in a field, as a "concept marker", as a representative of a particular genre, as one of the earliest works in a field, as authored by a recognized authority, as generative of much novel work, or as especially resistant to falsification;
old;
perceived as "high quality"; and
perceived as a source of a method or a design feature.

Most significantly of all, however, a highly cited work is less likely than a less-cited work to be perceived as "creative". Shadish et al. posit the existence of high quality but poorly cited articles "that are creative in a way that does not fit into existing conceptual frameworks or into accepted social norms for scholarship in an area".
Shadish et al. were led from their findings to conclude that, although citation counts are correlated with perceptions of quality, quality is not the only factor that has an impact on citation counts, and other such factors are themselves not correlated with quality. (Borgman, C & Furner, J, Scholarly Communication and Bibliometrics, op. cit.)

(…) White and Wang (1997) concluded (p. 147) that "citing behavior is complex, multidimensional behavior" and summarized their findings roughly as follows. Firstly, the "topicality" and "content" of the cited document were the most commonly used criteria on which citation decisions were based, although numerous other criteria were used on multiple occasions. Secondly, the choice of criteria in a particular instance seemed to depend on the "frame of reference" or purpose prioritized by the citer at that instance (e.g., execution of the research project or of the immediate task, augmentation of the field, satisfaction of external judges, etc.). Thirdly, some "metalevel" beliefs influence citation decisions independently of considerations of the ways in which individual documents can be used: these include beliefs about the value (even the morality) of self-citing, of copy-citing (copying citations found in other citers' papers), of citing secondary sources, of citing articles from peripheral journals, and of citing to meet external judges' expectations. White and Wang suggest that it might be possible, on this basis, to identify particular styles or codes of citing, and that certain styles may be characteristic not just of individual citers but of disciplines. (Borgman, C & Furner, J, Scholarly Communication and Bibliometrics, op. cit.)