Intelligences Journal revue en intelligence économique

Centroid method and centrality parameter: application in strategic watch

José Pino-Díaz, Laila Chiadmi-Garcia, Rosario Ruíz-Baños and Rafael Bailón-Moreno

Abstract

This paper explains the centroid method and its application in strategic knowledge mapping. The document sets of a research field form techno-scientific networks in which the links indicate association relationships between nodes. The Kamada-Kawai (KK) algorithm draws network graphs as if they were physical systems formed by nodes connected by forces, and it lays out the network in a state of local minimum energy. With this algorithm, the Euclidean distance between nodes is approximately proportional to their geodesic distance. The Euclidean distance of a node to the centroid is a measure of its centrality. This centrality is the parameter that measures the similarity of the nodes to the study topic of the techno-scientific network.
The strategic sub-networks are obtained by removing weak links. The nodes of strategic sub-networks are characterized by two parameters: the normalized centrality and the normalized density (the parameter that measures the average strength of the links of the sub-network). These two parameters are used in the analysis and strategic mapping of the techno-scientific network.
This paper presents the results of applying the method to Spanish scientific research on protected areas during the period 1981-2005. This type of research is very useful for decision making in science policy and in the evaluation of science and technology.

Keywords

strategic scientific and technological watch, knowledge engineering, information mapping, knowledge mapping, information visualization, decision making, technological and scientific networks, strategic knowledge maps, SK Maps, strategic importance maps, SI Maps

Full text

Introduction

The development in the 1980s of the Sociology of Science and Technology (Sociology of Translation or Sociology of Associations) by Michel Callon (1989) and Bruno Latour (1983), drawing on the conceptual resources of, among others, Michel Serres (Philosophy of Science) and David Bloor (Sociology of Knowledge), resulted in Actor-Network Theory.

Actor-Network Theory states that human and nonhuman actors are involved in the social construction of a scientific fact. The changing relationships between these actors result in a continuously changing network.

The strategic watch of techno-scientific networks involves different disciplines: Knowledge Discovery in Databases (KDD) (Han and Kamber, 2001; He, 1999), Knowledge Engineering (Callon, Courtial and Turner, 1991; Polanco, 1997) and Information Visualization (Old, 2002).

Clusters of nodes form sub-networks (nodes and links) within techno-scientific networks; these are characterized by the centrality and density parameters. The centrality parameter, or external cohesion index, indicates the position of a sub-network within the network and represents the degree of relationship between sub-networks. It can also be interpreted as a measure of similarity with respect to the main network: a sub-network with high centrality is one whose nodes are closer to the centroid, that is, one with a strong affinity with the main network. The density index, or internal cohesion parameter, measures the internal relations between nodes; a high density indicates strong or enduring relationships that are repeated and prolonged over time.

Strategic watch benefits from the development of knowledge systems (Polanco, 2008). Knowledge systems are useful for competitive intelligence and technology watch teams because they provide reports, charts, graphs and maps that support decision making (Pino-Díaz et al., 2011).

Objectives

The objective of this research is to develop a method to map techno-scientific networks using graph analysis and geographic information systems (GIS). The analysis of a documentary corpus with knowledge systems provides strategic textual information that can be visualized in 2D and 3D maps, which are useful in decision making.

Materials and Methods

For this study, a set of papers on protected natural areas, published from 1981 to 2005, was selected from the IEDCYT Spanish bibliographic databases (ICYT, ISOC and IME). This set comprises 942 documents, 3595 keywords, 1542 authors and 223 journals. With the Copalred knowledge system (Bailón-Moreno, 2003), the field KWAJ was created for each database record; it includes keywords (KW), authors (A) and journals (J). Copalred uses co-word analysis (Michelet, 1988; Law, Bauin, Courtial and Whittaker, 1988; Law and Whittaker, 1992). The analysis parameters selected were: nodes with a minimum occurrence of five, links between nodes with a minimum co-occurrence of three, and sub-network sizes between two and ten nodes.
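The occurrence and co-occurrence thresholds can be illustrated with a minimal sketch. It assumes the equivalence index takes the form usual in co-word analysis, e_ij = c_ij² / (c_i · c_j), and that it is rescaled by 10,000 to match the 0 to 10,000 range quoted for the index in this paper; the function name `equivalence_links` and the sample records are hypothetical and do not reproduce Copalred.

```python
from collections import Counter
from itertools import combinations


def equivalence_links(records, min_occ=5, min_cooc=3, scale=10_000):
    """Return {(a, b): e_ab} for descriptor pairs that pass both thresholds."""
    # Occurrence c_i: number of records in which descriptor i appears.
    occ = Counter(term for rec in records for term in set(rec))
    kept = {term for term, count in occ.items() if count >= min_occ}

    # Co-occurrence c_ij: number of records that contain both i and j.
    cooc = Counter()
    for rec in records:
        for a, b in combinations(sorted(set(rec) & kept), 2):
            cooc[(a, b)] += 1

    # Equivalence index e_ij = c_ij^2 / (c_i * c_j); the scale is an assumption.
    return {
        pair: scale * count * count / (occ[pair[0]] * occ[pair[1]])
        for pair, count in cooc.items()
        if count >= min_cooc
    }


# Hypothetical KWAJ-like records (each record is a list of descriptors).
records = (
    [["wetlands", "birds"]] * 4
    + [["wetlands", "birds", "tourism"]]
    + [["tourism"]] * 4
    + [["wetlands"]] * 3
)
links = equivalence_links(records)
```

In this toy corpus only the pair (birds, wetlands) co-occurs at least three times, so it is the only link that survives.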

The techno-scientific network was drawn with Pajek (Batagelj & Mrvar, 2010), a network analysis program. The Kamada-Kawai (KK) layout algorithm (Kamada & Kawai, 1989) was used to draw the network, with the viewing options "lines values are similarities" and "lines of different widths"; the link value is the equivalence index (association index) between two connected nodes. The (x, y) coordinates of the nodes in the 2D graph were used to build the SK Map.
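To make the idea of a minimum-energy layout concrete, the sketch below evaluates the spring energy that the Kamada-Kawai algorithm minimizes, E = Σ ½ k_ij (|p_i − p_j| − L·d_ij)² with k_ij = K / d_ij², where d_ij is the geodesic distance. It is a simplified illustration for a connected, unweighted graph; it does not reproduce Pajek's handling of similarity values, and the names `kk_energy`, `edge_length` and `strength` are chosen here for illustration.

```python
import math
from collections import deque
from itertools import combinations


def geodesic_distances(adj, source):
    """Breadth-first search: number of links on the shortest path from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def kk_energy(adj, pos, edge_length=1.0, strength=1.0):
    """Kamada-Kawai energy of layout `pos` for a connected, unweighted graph."""
    energy = 0.0
    for i, j in combinations(sorted(adj), 2):
        d_ij = geodesic_distances(adj, i)[j]
        ideal = edge_length * d_ij       # desired Euclidean distance
        k_ij = strength / d_ij ** 2      # spring stiffness
        actual = math.dist(pos[i], pos[j])
        energy += 0.5 * k_ij * (actual - ideal) ** 2
    return energy


# A three-node path a - b - c drawn in two different ways.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
straight = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (2.0, 0.0)}
folded = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 0.0)}
```

The straight drawing has zero energy because every Euclidean distance equals the corresponding geodesic distance; the folded drawing, which places c on top of a, has a higher energy.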

The centroid of the KK network is the point whose coordinates are the averages of the node coordinates in the 2D graph; it is the barycenter.

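Centroid coordinates:

$$x_c = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad y_c = \frac{1}{n}\sum_{i=1}^{n} y_i$$

where n is the number of nodes in the network and (x_i, y_i) are the coordinates of node i in the 2D graph.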

The nodal centrality (Ca) is the parameter that measures the similarity of a node to the centroid, namely the similarity of the node to the study topic of the network; its value is obtained from the Euclidean distance to the centroid.

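Euclidean distance of node a to the centroid:

$$d_a = \sqrt{(x_a - x_c)^2 + (y_a - y_c)^2}$$

where (x_a, y_a) are the coordinates of node a and (x_c, y_c) are those of the centroid.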
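As an illustration, the centroid and the distance of each node to it can be computed directly from the layout coordinates. This is a minimal sketch with hypothetical coordinates; the function names are chosen here and are not part of Pajek or ArcView.

```python
import math


def centroid(coords):
    """Average of the node coordinates (the barycenter of the 2D layout)."""
    n = len(coords)
    return (
        sum(x for x, _ in coords.values()) / n,
        sum(y for _, y in coords.values()) / n,
    )


def distances_to_centroid(coords):
    """Euclidean distance from every node to the centroid."""
    xc, yc = centroid(coords)
    return {node: math.hypot(x - xc, y - yc) for node, (x, y) in coords.items()}


# Hypothetical (x, y) coordinates exported from a 2D layout.
coords = {"a": (0.0, 0.0), "b": (4.0, 0.0), "c": (2.0, 3.0)}
center = centroid(coords)
dist = distances_to_centroid(coords)
```

Here the centroid is (2.0, 1.0), and node c lies closest to it, at a distance of 2.0.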

Nodal centrality of the node a :

Image3

To detect the most important or significant sub-networks, the network components are first separated, and weak links are then eliminated successively, raising the minimum equivalence index until the resulting groups or sub-networks have ten nodes or fewer (the maximum sub-network size set in this work). In the present paper the minimum link value was 1819 (the equivalence index takes values between 0 and 10,000). Finally, the research areas are obtained by grouping neighbouring sub-networks.
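The successive elimination of weak links can be sketched as a search for the lowest link threshold at which no connected group exceeds the size limit. This is a simplified illustration that applies one global threshold to the whole network; the function names and the sample links are hypothetical.

```python
def components(nodes, links):
    """Connected components via union-find over the given links."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in links:
        parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return list(groups.values())


def prune_weak_links(nodes, links, max_size=10):
    """Return (threshold, sub-networks) for the lowest threshold that works."""
    for threshold in sorted(set(links.values())):
        kept = [pair for pair, value in links.items() if value >= threshold]
        groups = components(nodes, kept)
        if all(len(g) <= max_size for g in groups):
            # Isolated nodes are dropped: a sub-network has at least two nodes.
            return threshold, [g for g in groups if len(g) >= 2]
    return None, []  # no threshold satisfies the size limit


# Hypothetical chain of five nodes with equivalence-index link values.
nodes = ["a", "b", "c", "d", "e"]
links = {("a", "b"): 5000, ("b", "c"): 1000, ("c", "d"): 4000, ("d", "e"): 3000}
threshold, subnets = prune_weak_links(nodes, links, max_size=3)
```

With a size limit of three nodes, removing the single weak link of value 1000 is enough: the threshold is 3000 and the chain splits into {a, b} and {c, d, e}.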

The z coordinate in the strategic knowledge map represents the parameter that measures the importance of a node in the network; this parameter is obtained by adding the nodal normalized centrality (Cna) and the nodal normalized density (DnA). Both take values between 0 and 100.

$$C_{na} = \frac{C_a}{\mathrm{Max}(C_n)} \times 100$$

where Cna is the nodal normalized centrality of node a; Ca is the centrality of node a; and Max(Cn) is the maximum nodal centrality in the network.

Image5

where DnA is the nodal normalized density of the nodes of sub-network A; g is the number of edges in sub-network A; i, …, j represent the nodes in sub-network A; eij are the equivalence indices in sub-network A; l is the number of nodes; and Max(Ds) is the maximum sub-network density among the sub-networks in the network.
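The nodal importance (the z coordinate) can be illustrated with a short sketch. It assumes that both parameters are normalized by dividing by the network maximum and rescaling to 0 to 100, which is consistent with the Max(Cn) and Max(Ds) terms and the stated range but is not confirmed by the surviving text. The raw centrality and density values are taken as given inputs, and all names and numbers are hypothetical.

```python
def normalize(values):
    """Rescale positive values to 0-100 by dividing by the maximum (assumption)."""
    top = max(values.values())  # must be non-zero
    return {key: 100.0 * value / top for key, value in values.items()}


def nodal_importance(centrality, subnet_density, membership):
    """z = Cna + DnA, where every node inherits the density of its sub-network."""
    cn = normalize(centrality)
    dn = normalize(subnet_density)
    return {node: cn[node] + dn[membership[node]] for node in centrality}


# Hypothetical raw values: nodal centralities and sub-network densities.
centrality = {"a": 2.0, "b": 1.0, "c": 0.5}
subnet_density = {"A": 300.0, "B": 150.0}
membership = {"a": "A", "b": "A", "c": "B"}
z = nodal_importance(centrality, subnet_density, membership)
```

Under these assumptions z ranges from 0 to 200; node a, the most central node in the densest sub-network, reaches the maximum.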

The strategic diagram is built from the rank values of the centrality (CA) and density (DA) of the sub-networks.

$$C_A = \frac{1}{l}\sum_{i=1}^{l} C_{ni}$$

where CA is the centrality of sub-network A; Cni is the nodal normalized centrality of node i; and l is the number of nodes in sub-network A.

Image7

where DA is the density of sub-network A, and DnA is the nodal normalized density of the nodes in sub-network A.

The strategic importance of a sub-network is a parameter that represents its relevance in the techno-scientific network; it is obtained as the product of a constant C and the sum of the centrality and density ranks of the sub-network. The value of C is: C = 1 for sub-networks located in quadrant 1 of the strategic diagram (upper right); C = 0.75 for sub-networks in quadrant 2 (lower right); C = 0.50 for sub-networks in quadrant 3 (upper left); and C = 0.25 for sub-networks in quadrant 4 (lower left).
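A minimal sketch of this calculation follows. Several points are assumptions rather than statements from the text: centrality is taken as the horizontal axis and density as the vertical axis (the usual convention for co-word strategic diagrams), rank 1 is the lowest value, ties are broken by name, and the quadrants are split at the midpoint of the rank scale. All names and values are hypothetical.

```python
QUADRANT_WEIGHT = {1: 1.0, 2: 0.75, 3: 0.50, 4: 0.25}


def ranks(values):
    """Rank 1 = lowest value; ties are broken by name (an assumption)."""
    ordered = sorted(values, key=lambda key: (values[key], key))
    return {key: position for position, key in enumerate(ordered, start=1)}


def quadrant(rank_c, rank_d, mid):
    """Quadrant number from centrality rank (x axis) and density rank (y axis)."""
    right, upper = rank_c > mid, rank_d > mid
    if right and upper:
        return 1
    if right:
        return 2
    if upper:
        return 3
    return 4


def strategic_importance(centrality, density):
    """C * (centrality rank + density rank) for every sub-network."""
    rc, rd = ranks(centrality), ranks(density)
    mid = (len(centrality) + 1) / 2  # midpoint of the rank scale (assumption)
    return {
        s: QUADRANT_WEIGHT[quadrant(rc[s], rd[s], mid)] * (rc[s] + rd[s])
        for s in centrality
    }


# Hypothetical sub-network centralities (CA) and densities (DA).
centrality = {"A": 80.0, "B": 60.0, "C": 30.0, "D": 10.0}
density = {"A": 70.0, "B": 20.0, "C": 90.0, "D": 40.0}
si = strategic_importance(centrality, density)
```

In this example sub-network A falls in quadrant 1 and obtains the highest strategic importance, while D, in quadrant 4, obtains the lowest.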

The SK Map was made using the ArcView GIS. In the SK Map, the central reference point is the centroid, and the measured parameters are the nodal centrality and the nodal importance. SK Maps can be viewed in 3D and in virtual reality.

The position of the sub-networks in the strategic diagram is displayed in the SK Maps using different symbols and colors for the nodes of the different sub-networks; the result is the Strategic Importance Map (SI Map).

Results

Figure 1 shows the sub-network clusters (strategic research areas) of the main component.

The strategic diagram is built from the rank values of the mean nodal normalized centrality and the mean nodal normalized density of the sub-networks (Fig. 2).

The strategic importance parameter is used to rank the sub-networks by strategic importance. Its values are assigned to the nodes of the sub-networks, and three rankings are obtained: SI Keywords, SI Researchers and SI Journals.

The position of the sub-networks in the strategic diagram is displayed in the SK Maps using symbols; the result is the Strategic Importance Map (SI Map) (Fig. 3).

Figure 1: Strategic research areas

Figure 2 : Strategic Diagram

Figure 3: SI Map

Finally, 3D images of the SK Maps were obtained (Fig. 4).

Figure 4: 3D Images of SK Maps; centrality (centroid similarity) and importance (altitude) of sub-networks

Conclusions

The centroid method for techno-scientific networks, together with the nodal centrality parameter, has been developed for mapping strategic knowledge. Co-word analysis, network analysis and information visualization (KK graphs, SK Maps and SI Maps) have been used for text mining of a techno-scientific documentary corpus. The technology and methodology of graph analysis and geographic information systems (GIS) have been used to make SK Maps of techno-scientific networks.

SK Maps and SI Maps use the landscape visual metaphor for textual information visualization. In SK Maps, the proximity of a node to the centroid is a measure of its affinity with the general study topic, and the nodal importance is indicated by its altitude. In SI Maps, the nodal strategic importance is indicated by symbols that show the position of the sub-network in the strategic diagram.

SK and SI Maps facilitate decision making in techno-scientific watch.

Bibliography

Bailón-Moreno, R. (2003). Ingeniería del conocimiento y vigilancia tecnológica aplicada a la investigación en el campo de los tensioactivos. Desarrollo de un modelo ciencimétrico unificado. Unpublished Ph. D. thesis, Universidad de Granada, Granada. Spain

Batagelj, V. & Mrvar, A. (2010). Networks/Pajek. Pajek Program for Large Network Analysis. Retrieved 4 April 2010, [online] http://vlado.fmf.uni-lj.si/pub/networks/pajek/

Callon, M. (1989). La science et ses réseaux : genèse et circulation des faits scientifiques. Paris: La Découverte.

Callon, M., Courtial, J. and Turner, W. (1991). La méthode Leximappe : un outil pour l'analyse stratégique du développement scientifique et technique. In B. Vinck, Gestion de la recherche : nouveaux problèmes, nouveaux outils. (207-277). Editions Bruxelles.

Han, J. & Kamber, M. (2001). Data Mining: Concepts and Techniques (2nd ed.). San Francisco: Morgan Kaufmann Publishers, p. 550.

He, Q. (1999). Knowledge discovery through co-word analysis. Library Trends, 48 (1), 133-159.

Kamada, K. & Kawai, S. (1989). An algorithm for drawing general undirected graphs. Information Processing Letters 31(1), 7-15

Latour, B. (1983). Give me a Laboratory and I Will Raise the World. In K. Knorr-Cetina & M. Mulkay, Science Observed: Perspectives on the Social Study of Science. London: Sage.

Law, J., Bauin, S., Courtial, J. P., & Whittaker, J. (1988). Policy and the mapping of scientific change: a co-word analysis of research into environmental acidification. Scientometrics, 14 (3-4), 251-264.

Law, J. & Whittaker, J. (1992). Mapping acidification research : a test of the co-word method. Scientometrics, 23 (3), 417-461.

Michelet, B. (1988). L'analyse des associations. PhD Thesis. Paris : Université de Paris 7.

Old, L. J. (2002). Information Cartography: Using Your GIS for Non-spatial Data Analysis. Proceedings, GIS 2002 Conference, Indianapolis, IN, Feb. 2002.

Polanco, X. (2008). Transformer l’information en connaissance avec STANALYST. Cadre conceptuel et Modèle. Encontros Bibli : Revista Eletrônica Biblioteconomia y Ciência de la Informaçao, n. esp., 76-91

Polanco, X. (1997). Infometría e Ingeniería del Conocimiento : Exploración de Datos y Análisis de la Información en vista del Descubrimiento de Conocimientos. In Jaramillo, H.; Albornoz, M. (ed.) El universo de la medición : La perspectiva de la Ciencia y la Tecnología. COLCIENCIAS, CYTED, RICYT. Tercer Mundo Editores. Bogotá, Colombia.

Pino-Díaz, J. (2011). Análisis estratégico de la investigación sobre áreas protegidas en España : ingeniería y cartografía del conocimiento. Unpublished Ph. D. thesis, Universidad de Granada, Granada. Spain. [online] http://hdl.handle.net/10760/15995

Pino-Díaz, J., Jiménez-Contreras, E., Ruíz-Baños, R. and Bailón-Moreno, R. (2011). Evaluación de redes tecnocientíficas : la red española sobre áreas protegidas, según la Web of Science. Revista Española de Documentación Científica. CSIC. 34 (3), 301-333

Pino-Díaz, J., Jiménez-Contreras, E., Ruíz-Baños, R. and Bailón-Moreno, R. (2012). Strategic knowledge maps of the techno-scientific network (SK maps). J. Am. Soc. Inf. Sci., 63: 796–804. doi: 10.1002/asi.21712

To cite this document:

José Pino-Díaz, Laila Chiadmi-Garcia, Rosario Ruíz-Baños and Rafael Bailón-Moreno, «Centroid method and centrality parameter: application in strategic watch», Intelligences Journal [Online], Number 3, Full text issues, URL: http://lodel.irevues.inist.fr/isj/index.php?id=294

Authors

José Pino-Díaz
Departamento de Griego, Estudios Árabes, Lingüística y Documentación. Facultad de Filosofía y Letras. Universidad de Málaga. Campus de Teatinos. 29071 – Málaga (Spain)
Laila Chiadmi-Garcia
Departamento de Ingeniería Química. Facultad de Ciencias. Universidad de Granada. 18071 – Granada (Spain)
Rosario Ruíz-Baños
Departamento de Biblioteconomía. Facultad de Comunicación y Documentación. Universidad de Granada. Colegio Máximo de Cartuja. 18071 – Granada (Spain)
Rafael Bailón-Moreno
Departamento de Ingeniería Química. Facultad de Ciencias. Universidad de Granada. 18071 – Granada (Spain)