Centroid method and centrality parameter: application in strategic watch
José Pino-Díaz, Laila Chiadmi-Garcia, Rosario Ruíz-Baños and Rafael Bailón-Moreno
This paper explains the centroid method and its application in strategic knowledge mapping. The document sets of a research field form techno-scientific networks in which the links indicate association relationships between nodes. The Kamada-Kawai (KK) algorithm draws network graphs as if they were physical systems formed by nodes connected by forces, and places the nodes in a configuration that corresponds to a local energy minimum. With this algorithm, the Euclidean distance between nodes is approximately proportional to their geodesic distance. The Euclidean distance of a node to the centroid is a measure of its centrality. This centrality is the parameter that measures the similarity of a node to the study topic of the techno-scientific network.
The strategic sub-networks are obtained by removing weak links. The nodes of the strategic sub-networks are characterized by two parameters: the normalized centrality and the normalized density (the parameter that measures the average strength of the links of a sub-network). These two parameters are used in the analysis and strategic mapping of the techno-scientific network.
This paper presents the results of applying the method to Spanish scientific research on protected areas during the period 1981-2005. This type of research is highly useful for decision making in science policy and in the evaluation of science and technology.
Keywords: strategic scientific and technological watch, knowledge engineering, information mapping, knowledge mapping, information visualization, decision making, technological and scientific networks, strategic knowledge maps, SK Maps, strategic importance maps, SI Maps
The development in the 1980s of the Sociology of Science and Technology (Sociology of Translation or Sociology of Associations) by Michel Callon (1989) and Bruno Latour (1983), drawing on conceptual resources from, among others, Michel Serres (Philosophy of Science) and David Bloor (Sociology of Knowledge), resulted in Actor-Network Theory.
Actor-Network Theory states that both human and nonhuman actors are involved in the social construction of a scientific fact. The changing relationships between these actors result in a network that is continuously changing.
The strategic watch of techno-scientific networks involves several disciplines: Knowledge Discovery in Databases (KDD) (Han and Kamber, 2001; He, 1999), Knowledge Engineering (Callon, Courtial and Turner, 1991; Polanco, 1997) and Information Visualization (Old, 2002).
Clusters of nodes form sub-networks (nodes and links) in techno-scientific networks; these are characterized by the centrality and density parameters. The centrality parameter, or external cohesion index, indicates the position of a sub-network within the network and represents the degree of relationship between sub-networks. It can also be interpreted as a measure of similarity with respect to the main network: a sub-network with high centrality is one whose nodes are closer to the centroid, that is, a sub-network with a great affinity with the main network. The density index, or internal cohesion parameter, measures the internal relationships between nodes; a high density indicates strong or enduring relationships that are repeated and prolonged over time.
Strategic watch benefits from the development of knowledge systems (Polanco, 2008). Knowledge systems are useful for competitive intelligence and technology watch teams because they provide reports, charts, graphs and maps that support decision making (Pino-Díaz et al., 2011).
The objective of this research is to develop a method for mapping techno-scientific networks using graph analysis and geographic information systems (GIS). The analysis of a document corpus with knowledge systems provides strategic textual information that can be visualized in 2D and 3D maps, which are useful in decision making.
The study uses a set of papers on protected natural areas published from 1981 to 2005 and collected from the Spanish IEDCYT bibliographic databases (ICYT, ISOC and IME). The set comprises 942 documents, 3595 keywords, 1542 authors and 223 journals. With the Copalred knowledge system (Bailón-Moreno, 2003), a KWAJ field was created for each database record, combining keywords (KW), authors (A) and journals (J). Copalred uses co-word analysis (Michelet, 1988; Law, Bauin, Courtial, and Whittaker, 1988; Law and Whittaker, 1992). The analysis parameters selected were: nodes with a minimum occurrence of five, links between nodes with a minimum co-occurrence of three, and sub-network sizes between two and ten nodes.
The techno-scientific network was drawn with Pajek (Batagelj & Mrvar, 2010), a network analysis software package. The Kamada-Kawai (KK) layout algorithm (Kamada & Kawai, 1989) was used to draw the network, with the viewing options "lines values are similarities" and "lines of different widths"; the link value is the value of the equivalence index (association index) between the two connected nodes. The (x, y) coordinates of the nodes in the resulting 2D graph were used to build the SK Map.
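The link values used in this step can be illustrated with a short sketch. It assumes the usual co-word definition of the equivalence index, e_ij = c_ij² / (c_i · c_j), where c_i and c_j are the occurrences of two descriptors and c_ij is their co-occurrence, multiplied by 10,000 so that values fall in the 0 to 10,000 range reported in this paper. The scaling, the function name `equivalence_links` and the toy records are assumptions made for illustration; this is not Copalred's implementation.

```python
from collections import Counter
from itertools import combinations


def equivalence_links(records, min_occ=5, min_cooc=3, scale=10_000):
    """Return {(a, b): equivalence index} for descriptor pairs that pass both thresholds.

    records: iterable of descriptor lists (one list per document, e.g. a KWAJ field).
    min_occ / min_cooc: minimum occurrence and co-occurrence (5 and 3 in this paper).
    scale: assumed scaling factor giving values between 0 and 10,000.
    """
    occ = Counter()   # occurrences of each descriptor
    cooc = Counter()  # co-occurrences of each unordered descriptor pair
    for descriptors in records:
        items = sorted(set(descriptors))  # count each descriptor once per document
        occ.update(items)
        cooc.update(combinations(items, 2))

    links = {}
    for (a, b), c_ab in cooc.items():
        if occ[a] >= min_occ and occ[b] >= min_occ and c_ab >= min_cooc:
            links[(a, b)] = scale * c_ab ** 2 / (occ[a] * occ[b])
    return links


# Hypothetical toy corpus: "tourism" occurs only twice, so it is filtered out.
records = [["parks", "birds"]] * 4 + [["parks", "tourism"]] * 2 + [["birds"]]
links = equivalence_links(records)
```

In this toy corpus only the pair ("birds", "parks") survives, with 4 co-occurrences against 5 and 6 occurrences.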
The centroid of the KK network is the point whose coordinates are the averages of the node coordinates in the 2D graph; it is the barycenter.
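Centroid coordinates:

$$x_c = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad y_c = \frac{1}{n}\sum_{i=1}^{n} y_i$$

where n is the number of nodes and (x_i, y_i) are the coordinates of node i in the 2D graph.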
The nodal centrality (Ca) is the parameter that measures the similarity of a node to the centroid, that is, the similarity of the node to the study topic of the network; its value is obtained from the Euclidean distance of the node to the centroid.
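Euclidean distance of the node to the centroid:

$$d_a = \sqrt{(x_a - x_c)^2 + (y_a - y_c)^2}$$

where (x_a, y_a) are the coordinates of node a and (x_c, y_c) are the centroid coordinates.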
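As an illustration, the centroid and the node-to-centroid distances can be computed from exported layout coordinates as in the following minimal sketch. The function names and the three sample coordinates are hypothetical, and the sketch stops at the distance: how that distance is converted into the centrality Ca is not reproduced here.

```python
import math


def centroid(coords):
    """Average the (x, y) coordinates of all nodes; coords maps node -> (x, y)."""
    n = len(coords)
    xc = sum(x for x, _ in coords.values()) / n
    yc = sum(y for _, y in coords.values()) / n
    return xc, yc


def distances_to_centroid(coords):
    """Euclidean distance of every node to the centroid of the layout."""
    xc, yc = centroid(coords)
    return {node: math.hypot(x - xc, y - yc) for node, (x, y) in coords.items()}


# Hypothetical 2D coordinates, standing in for a layout exported from Pajek.
coords = {"a": (0.0, 0.0), "b": (4.0, 0.0), "c": (2.0, 3.0)}
center = centroid(coords)
dist = distances_to_centroid(coords)
closest = min(dist, key=dist.get)  # node nearest to the centroid
```

With these sample values the centroid is (2, 1) and node "c" lies closest to it, so under the paper's reading it would be the node most similar to the study topic.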
The most important or significant sub-networks are detected in two steps: first, the network components are separated; second, weak links are successively eliminated, raising the minimum equivalence index value until the resulting groups or sub-networks have ten nodes or fewer (the maximum sub-network size set in this work). In this paper, the minimum link value was 1819 (the equivalence index takes values between 0 and 10,000). Finally, the research areas are obtained by grouping neighboring sub-networks.
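One possible reading of this procedure is sketched below: the weakest links are removed from the whole network, one value at a time, until no connected component has more than ten nodes. Applying a single global threshold is an assumption (the text does not say whether the threshold is global or set per component), and the function names and toy link values are hypothetical.

```python
def components(nodes, links):
    """Connected components of an undirected graph given as {(a, b): weight}."""
    adj = {n: set() for n in nodes}
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # depth-first search from `start`
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def strategic_subnetworks(links, max_size=10, min_size=2):
    """Drop the weakest links until every component has at most max_size nodes.

    Returns (sub-networks with at least min_size nodes, lowest surviving link value).
    """
    nodes = sorted({n for pair in links for n in pair})
    remaining = dict(links)
    while remaining:
        if all(len(c) <= max_size for c in components(nodes, remaining)):
            break
        weakest = min(remaining.values())
        remaining = {pair: w for pair, w in remaining.items() if w > weakest}
    subnets = [c for c in components(nodes, remaining) if len(c) >= min_size]
    threshold = min(remaining.values()) if remaining else None
    return subnets, threshold


# Hypothetical chain of four nodes; max_size=2 forces the weak middle link out.
toy_links = {("a", "b"): 9000, ("b", "c"): 1000, ("c", "d"): 8000}
subnets, threshold = strategic_subnetworks(toy_links, max_size=2)
```

In the toy example the link of value 1000 is removed, which leaves the sub-networks {a, b} and {c, d}; the returned threshold plays the role of the minimum link value (1819 in this paper).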
The z coordinate in the strategic knowledge map represents the importance of a node in the network; this parameter is obtained by adding the nodal normalized centrality (Cna) and the nodal normalized density (DnA). Both the nodal normalized centrality and the nodal normalized density take values between 0 and 100.
Where DnA is the nodal normalized density of the nodes of sub-network A; g is the number of edges in sub-network A; i, …, j are the nodes of sub-network A; eij are the equivalence indices in sub-network A; l is the number of nodes; and Max(Ds) is the maximum density among the sub-networks of the network.
The strategic diagram is built from the rank values of the centrality (CA) and density (DA) of the sub-networks.
The strategic importance of a sub-network is a parameter that represents its relevance in the techno-scientific network; it is obtained by multiplying the constant C by the sum of the centrality and density ranks of the sub-network. The value of C is: C = 1 for sub-networks located in quadrant 1 of the strategic diagram (upper right quadrant); C = 0.75 for sub-networks located in quadrant 2 (lower right quadrant); C = 0.50 for sub-networks located in quadrant 3 (upper left quadrant); and C = 0.25 for sub-networks located in quadrant 4 (lower left quadrant).
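The calculation of strategic importance can be sketched as follows. The sketch assumes that centrality is plotted on the horizontal axis and density on the vertical axis, that rank n corresponds to the highest value, and that the quadrants are split at the middle rank (n + 1) / 2; none of these conventions is stated explicitly in the text. The names `quadrant` and `strategic_importance` and the sample ranks are hypothetical.

```python
# Constant C for each quadrant of the strategic diagram, as given in the text.
QUADRANT_C = {1: 1.0, 2: 0.75, 3: 0.50, 4: 0.25}


def quadrant(rank_c, rank_d, mid):
    """Quadrant of a sub-network from its centrality and density ranks."""
    right = rank_c > mid  # high centrality (assumed horizontal axis)
    upper = rank_d > mid  # high density (assumed vertical axis)
    if right and upper:
        return 1
    if right:
        return 2
    if upper:
        return 3
    return 4


def strategic_importance(ranks):
    """ranks: {sub-network: (centrality rank, density rank)}; rank n = highest value."""
    mid = (len(ranks) + 1) / 2  # assumed split point between low and high ranks
    return {
        name: QUADRANT_C[quadrant(rc, rd, mid)] * (rc + rd)
        for name, (rc, rd) in ranks.items()
    }


# Hypothetical ranks for four sub-networks.
ranks = {"A": (4, 3), "B": (3, 1), "C": (1, 4), "D": (2, 2)}
importance = strategic_importance(ranks)
ranking = sorted(importance, key=importance.get, reverse=True)
```

A rank that falls exactly on the split point (possible when the number of sub-networks is odd) is treated here as low; that tie rule is an arbitrary choice of this sketch.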
The SK Map was made using the ArcView GIS. In the SK Map, the central reference point is the centroid, and the measured parameters are the nodal centrality and the nodal importance. SK Maps can be viewed in 3D and in virtual reality.
The position of the sub-networks in the strategic diagram is displayed in the SK Maps by using different symbols and colors for the nodes of different sub-networks; the result is the Strategic Importance Map (SI Map).
Figure 1 shows the sub-network clusters (strategic research areas) of the main component.
The strategic diagram is built from the rank values of the mean nodal normalized centrality and the mean nodal normalized density of the sub-networks (Fig. 2).
The strategic importance parameter is used to rank the sub-networks by strategic importance. The strategic importance values are assigned to the nodes of the sub-networks, and three rankings are obtained: SI Keywords, SI Researchers and SI Journals.
The position of the sub-networks in the strategic diagram is displayed in the SK Maps using symbols; the result is the Strategic Importance Map (SI Map) (Fig. 3).
Finally, 3D images of the SK Maps were obtained (Fig. 4).
The centroid method for techno-scientific networks and the nodal centrality parameter were developed for mapping strategic knowledge. Co-word analysis, network analysis and information visualization (KK graphs, SK Maps and SI Maps) were used for text mining of a techno-scientific document corpus. Graph analysis and geographic information systems (GIS) were used to make the SK Maps of techno-scientific networks.
The SK Maps and SI Maps use the landscape visual metaphor for textual information visualization. In the SK Maps, the proximity of a node to the centroid is a measure of its affinity with the general study topic, and the importance of a node is indicated by its altitude. In the SI Maps, the strategic importance of a node is indicated by symbols that show the position of its sub-network in the strategic diagram.
The SK and SI Maps facilitate decision making in techno-scientific watch.
Bailón-Moreno, R. (2003). Ingeniería del conocimiento y vigilancia tecnológica aplicada a la investigación en el campo de los tensioactivos. Desarrollo de un modelo ciencimétrico unificado. Unpublished Ph. D. thesis, Universidad de Granada, Granada. Spain
Batagelj, V. & Mrvar, A. (2010). Networks/Pajek. Pajek Program for Large Network Analysis. Retrieved April 4, 2010, from http://vlado.fmf.uni-lj.si/pub/networks/pajek/
Callon, M. (1989). La science et ses réseaux : gènese et circulation des faits scientifiques. Paris : Découverte.
Callon, M., Courtial, J. & Turner, W. (1991). La méthode Leximappe : un outil pour l'analyse stratégique du développement scientifique et technique. In B. Vinck, Gestion de la recherche : nouveaux problèmes, nouveaux outils (pp. 207-277). Editions Bruxelles.
Han, J. & Kamber, M. (2001). Data Mining: Concepts and Techniques (2nd ed.). San Francisco: Morgan Kaufmann Publishers, p. 550.
He, Q., (1999). Knowledge discovery through co-word analysis. Library Trends, 48 (1), 133-159.
Kamada, K. & Kawai, S. (1989). An algorithm for drawing general undirected graphs. Information Processing Letters 31(1), 7-15
Latour, B. (1983). Give me a Laboratory and I Will Raise the World. In K. Knorr-Cetina & M. Mulkay, Science Observed: Perspectives on the Social Study of Science. London: Sage.
Law, J., Bauin, S., Courtial, J. P., & Whittaker, J. (1988). Policy and the mapping of scientific change: a co-word analysis of research into environmental acidification. Scientometrics, 14 (3-4), 251-264.
Law, J. & Whittaker, J. (1992). Mapping acidification research: a test of the co-word method. Scientometrics, 23 (3), 417-461.
Michelet, B. (1988). L'analyse des associations. PhD Thesis. Paris : Université de Paris 7.
Polanco, X. (2008). Transformer l’information en connaissance avec STANALYST. Cadre conceptuel et Modèle. Encontros Bibli : Revista Eletrônica Biblioteconomia y Ciência de la Informaçao, n. esp., 76-91
Polanco, X. (1997). Infometría e Ingeniería del Conocimiento: Exploración de Datos y Análisis de la Información en vista del Descubrimiento de Conocimientos. In Jaramillo, H.; Albornoz, M. (eds.) El universo de la medición: La perspectiva de la Ciencia y la Tecnología. COLCIENCIAS, CYTED, RICYT. Tercer Mundo Editores. Bogotá, Colombia.
Pino-Díaz, J. (2011). Análisis estratégico de la investigación sobre áreas protegidas en España: ingeniería y cartografía del conocimiento. Unpublished Ph. D. thesis, Universidad de Granada, Granada. Spain. [online] http://hdl.handle.net/10760/15995
Pino-Díaz, J., Jiménez-Contreras, E., Ruíz-Baños, R. & Bailón-Moreno, R. (2011). Evaluación de redes tecnocientíficas: la red española sobre áreas protegidas, según la Web of Science. Revista Española de Documentación Científica. CSIC. 34 (3), 301-333
Pino-Díaz, J., Jiménez-Contreras, E., Ruíz-Baños, R. & Bailón-Moreno, R. (2012). Strategic knowledge maps of the techno-scientific network (SK maps). J. Am. Soc. Inf. Sci., 63: 796–804. doi: 10.1002/asi.21712