A Model to Represent and Process Scientific Knowledge in Biomedical Articles with Semantic Web Technologies

 
CONTINUE READING
A Model to Represent and Process Scientific Knowledge in Biomedical Articles with Semantic Web Technologies
86                                                                                                        Knowl. Org. 43(2016)No.2
                   C. H. Marcondes and L. C. da Costa. A Model to Represent and Process Scientific Knowledge in Biomedical Articles ...

     A Model to Represent and Process Scientific Knowledge
     in Biomedical Articles with Semantic Web Technologies†
                              Carlos H. Marcondes* and Leonardo C. da Costa**
                Post-graduate Program of Information Science, Federal Fluminense University,
                         R. Tiradentes, 148, Ingá, CEP 24210-510, Niterói, RJ, Brazil
                            *, **

                           Carlos Henrique Marcondes is Professor in the Department of Information Science and in the Information
                           Science Postgraduate Program at Federal Fluminense University, Rio de Janeiro, Brazil. He is an associate re-
                           searcher of CNPq, The Brazilian Council for Scientific Development. His research interests are conceptual
                           modeling, scientific communication, theory of classification and ontological analysis.

                           Leonardo Cruz da Costa is Professor in the Department of Computer Science and in the Information Science
                           Postgraduate Program at Federal Fluminense University, Rio de Janeiro, Brazil. His research interests are con-
                           ceptual modeling, scientific communication, text mining and semantic annotation.

                           Marcondes, Carlos H. and Leonardo C. da Costa. 2016. “A Model to Represent and Process Scientific Knowl-
                           edge in Biomedical Articles with Semantic Web Technologies.” Knowledge Organization 43 no. 2: 86-101. 60 refer-
                           ences.

                             Abstract: Knowledge organization faces the challenge of managing the amount of knowledge available on the
                             Web. Published literature in biomedical sciences is a huge source of knowledge, which can only efficiently be
                             managed through automatic methods. The conventional channel for reporting scientific results is Web elec-
                             tronic publishing. Despite its advances, scientific articles are still published in print formats such as portable
                             document format (PDF). Semantic Web and Linked Data technologies provides new opportunities for com-
                             municating, sharing, and integrating scientific knowledge that can overcome the limitations of the current print
                             format. Here is proposed a semantic model of scholarly electronic articles in biomedical sciences that can
                             overcome the limitations of traditional flat records formats. Scientific knowledge consists of claims made
                             throughout article texts, especially when semantic elements such as questions, hypotheses and conclusions are
                             stated. These elements, although having different roles, express relationships between phenomena. Once such
                             knowledge units are extracted and represented with technologies such as RDF (Resource Description Frame-
work) and linked data, they may be integrated in reasoning chains. Thereby, the results of scientific research can be published and shared
in structured formats, enabling crawling by software agents, semantic retrieval, knowledge reuse, validation of scientific results, and identi-
fication of traces of scientific discoveries.

Received: 5 August 2015 Revised: 21 December 2015; Accepted 29 December 2015

Keywords: scientific knowledge, models, knowledge units, biomedical research articles, semantic web, linked open data, RDF, PDF

† This article is an extended version of the paper presented in the Workshop on Knowledge Organization and Semantic Web, German
  ISKO e.V., SEMANTiCS, Leipzig, Germany, 1st Sept. 2014. Our thanks to the Brazilian grant agencies CNPq and CAPES.

1.0 Introduction                                                                    an electronic metaphor of the print publishing system
                                                                                    used throughout the twentieth century and is still based
Since the rise of the first scientific journal—The Proceedings                      on linear text formats such as portable document format
of the Royal Society—in the seventeenth century, scientific                         (PDF). The textual format of articles suitable for human
articles have become privileged channels of scientific                              reading limits the possibilities for automatic processing of
communication. The content of scientific articles is sub-                           their contents. Tasks such as knowledge reuse, discovery,
mitted to critical reading, inquiry, and citation through a                         gap analysis, contradictions and agreements in knowledge,
long social process until it becomes public knowledge.                              and validation of scientific results, all demand human ef-
The current scholarly Web publishing environment is still                           fort. The availability of automatic tools to assist such tasks

                                                        https://doi.org/10.5771/0943-7444-2016-2-86
                                                 Generiert durch IP '46.4.80.155', am 25.05.2021, 20:26:24.
                                          Das Erstellen und Weitergeben von Kopien dieses PDFs ist nicht zulässig.
A Model to Represent and Process Scientific Knowledge in Biomedical Articles with Semantic Web Technologies
Knowl. Org. 43(2016)No.2                                                                                                                    87
C. H. Marcondes and L. C. da Costa. A Model to Represent and Process Scientific Knowledge in Biomedical Articles ...

is increasingly important as the number of scientific arti-                      mantic type. The UMLS SN uses fifty-four relation types
cles published in digital formats increases and scientists in                    to express the semantic relations between concepts in se-
their daily work have to process results from different ar-                      mantic type hierarchies. The UMLS SN determines the al-
ticles and sources. For example, the PubMed repository                           lowed relations between semantic types. Although this
currently holds over 23 million articles. Scientific articles                    semantically rich schema of representing the content of
are also spread across various information resources such                        articles by relations is supported by the UMLS, the biblio-
as digital libraries, electronic journal systems, and reposito-                  graphic record models in databases such as Medline are
ries. According to Renear and Palmer (2009), scientists are                      not able to explore its full potential. For example, two of
increasingly using what they call strategic reading to cope                      the UMLS semantic types are “Pharmacologic Substance”
with the amount of literature being published. Research                          (UMLS unique identifier T121) and “Pathologic Func-
tasks demand new tools for information discovery, re-                            tion” (UI T046). Relationships between these two con-
trieval, and content comparison in very specific, precise,                       cepts can be quite different, a pharmacologic substance
and meaningful ways.                                                             can “cause” (UI T147) a pathologic function, or a phar-
     Although modern bibliographic information retrieval                         macologic substance can “prevent” (UI T148) a patho-
(IR) systems exploit the potential of information technol-                       logic function. These two different relationships can only
ogy, they are not yet used to directly process the knowl-                        be expressed in one way in Boolean algebra as “Pharma-
edge embedded in the text of scientific articles. Since the                      cologic Substance AND Pathologic Function.”
MARC (MAchine-Readable Cataloging) record was estab-                                 Many different relationships involving scientific articles
lished in 1960, bibliographic record models have changed                         can be identified: bibliographic or citation relations; rela-
little. A typical bibliographic record is comprised of a flat                    tions with datasets holding raw results of scientific ex-
set of unconnected database fields for content descrip-                          periments or databases such as GenBank, DrugBank, Ar-
tion, keywords or descriptors, or other attributes such as                       rayExpress, PhenomicDB; internal relations between parts
author, journal title, publication date etc., each having an                     of an article—semantic components—such as a problem,
equal weight for retrieval purposes. Content access to                           question, hypothesis, methodology, results or conclusion;
documents in modern bibliographic information retrieval                          relations with terminological knowledge bases or ontolo-
systems is still achieved by matching user queries com-                          gies such as UMLS or GO; relations with grant agencies
prised of keywords connected by Boolean operators to                             that support the research; relations with claims within an
keywords or attribute values presented in the bibliographic                      article and across articles; relations within two different
records, a technology similar to early bibliographic re-                         bibliographic sets (literature-related discovery methodolo-
trieval and library automation systems of the 1960s and                          gies) (Swanson et al., 2006); and relations with annotations
1970s. The conventional bibliographic records do not                             or comments made about an article. Now the semantic
show explicit relations between elements comprising the                          web and linked open data environments provide means of
content of documents they represent. This state of affairs                       making explicit all these relationships.
did not change even with the rise of formats such as                                 The current citation-based information retrieval sys-
Dublin Core, developed to manage electronic scientific                           tems and the print model of publication constitute closed
publications.                                                                    systems where scientific articles are isolated from the
     Scientific knowledge aims at universality and necessity.                    mainstream Web and thus are barriers to data reuse, shar-
Such characteristics make this kind of knowledge suscep-                         ing, integration, and synthesis. Within such environments,
tible to heavy reuse. Miller (1947, 310) states that “science                    those relations are implicit, informal, and are not coded in
is a search after internal relations between phenomena.”                         machine-processable formats. Scientific publications were
Scientific knowledge, as it appears in the text of scientific                    first recorded in bibliographic databases. Based on these
articles, consists of claims made by authors throughout                          databases, citation models and citation networks were de-
the article text, synthesized in the article’s conclusions.                      veloped as tools to understand and manage the develop-
These claims comprise the knowledge in scientific articles.                      ment of science (Garfield et al., 1964). This situation can
They are highly reliable knowledge units, as they are vali-                      now be overcome within the scope of the semantic web
dated by the peer-review process and are the result of an                        linked open data platform. These technologies offer the
experiment described and tested in the article.                                  possibility of developing a richer and more multifaceted
     Compared to the poor expressiveness of the three                            scientific knowledge environment where navigating
Boolean operators, AND, OR, and NOT, the Unified                                 throughout a citation network will be only one of the
Medical Language System (UMLS) Semantic Network                                  many possibilities.
(SN), the classification schema of the UMLS National In-                             The semantic web (Berners-Lee et al., 2001) and linked
stitutes of Health Metathesaurus, organizes every concept                        open data technologies (Bizer et al., 2009) constitute a
in hierarchy trees, each having as its root a top level se-                      step forward from conventional information retrieval en-

                                                     https://doi.org/10.5771/0943-7444-2016-2-86
                                              Generiert durch IP '46.4.80.155', am 25.05.2021, 20:26:24.
                                       Das Erstellen und Weitergeben von Kopien dieses PDFs ist nicht zulässig.
A Model to Represent and Process Scientific Knowledge in Biomedical Articles with Semantic Web Technologies
88                                                                                                       Knowl. Org. 43(2016)No.2
                  C. H. Marcondes and L. C. da Costa. A Model to Represent and Process Scientific Knowledge in Biomedical Articles ...

vironments. The content of a Web document is no longer                           were inputs to the model. The different types of scien-
a matter of a simple keyword match, as in conventional                           tific articles according to their reasoning patterns found
computational environments since the 1960s, but instead                          in this literature were classified and incorporated in the
comprises structured sets of concepts connected by pre-                          model. The model was tested and adjusted by analysing
cise meaning relations expressed in RDF (Resource De-                            eighty-nine articles in biomedical sciences with the aim of
scription Framework) and RDF schema directly published                           identifying the semantic components of scientific meth-
and available through the Web. Such a rich knowledge                             odology, how they appeared in the texts of the articles,
representation schema enables software agents to perform                         and reasoning patterns and sequences that combine these
“inferences” and more sophisticated tasks based on article                       elements. Details of this empirical analysis are related in
content.                                                                         Marcondes et al. (2014). The model is graphically pre-
    These technologies provide new opportunities for                             sented as a UML (Unified Modeling Language) class dia-
communicating, sharing, reusing, interlinking, and inte-                         gram, developed using the NClass software tool. Seman-
grating scientific knowledge published in digital formats                        tic records in RDF were validated using W3C Validator
that may overcome the limits of the current print format                         services (http://www.w3.org/RDF/Validator/); screens
used for the publication of scientific results, which is only                    with graphs generated from RDF records were captured
suitable for reading and processing by people. We are now                        from the same service.
beginning to use these technologies for sophisticated tasks
such as knowledge discovery, knowledge comparison, and                           3.0 Bases of the model
integration of multiple sources, which facilitates inference
capabilities among different and autonomous information                          In this section, the theoretical and empirical bases used
resources.                                                                       for developing the model are presented and discussed. To
    The aim of this article is to propose a semantic model                       achieve these goals, first we discuss how to identify a
of scientific articles that goes beyond conventional meta-                       knowledge unit in the text of scientific articles; after that
data formats such as MARC and Dublin Core. The pro-                              we discuss the question of how to model a context where
posed model enables the management of article content                            such knowledge units can be safely used in valid reason-
not just for retrieval purposes by humans, but also for                          ing chains.
automatic reasoning. That means not only enhanced pos-                               What are the methods for achieving truth in science?
sibilities for semantic retrieval by human users, but also                       In the modern age, an important contribution to this dis-
the enabling of the content of scientific articles to be in-                     cussion was the proposal of the scientific method, postu-
tegrated in inference chains by computer programs and to                         lated by Francis Bacon (1973) (among others). In opposi-
be queried as a database. The model also addresses and                           tion to medieval scholastics, Bacon emphasized the im-
outlines a wholly new scientific publishing environment,                         portance of observational experiments to set general laws
one that exploits the possibilities created by semantic web                      in science. His reasoning method of deriving general
and linked open data technologies (Shadbolt et al., 2006).                       statements from a particular number of observational
The model aims also at providing guidelines for the devel-                       cases was called induction.
opment of technological solutions that can, partially or                             Today’s version of scientific method is called the hy-
completely, implement the model’s components. The arti-                          pothetico-deductive method, largely used in experimental
cle is organized as follows: following the introduction, sec-                    sciences such as the biomedical sciences. The method
tion 2 discusses the theoretical foundations of the model                        consists of giving a problem, proposing a feasible hy-
and related works; section 3 presents and explains the                           pothesis, deducing consequences of the hypothesis and
model; section 4 discusses model implementations so far                          developing experiments to test it. The results of the ex-
and the model’s potential to enhance information re-                             periments confirm or reject the hypothesis.
trieval, knowledge discovery, identification of discoveries                          An hypothesis is an essential component of the hy-
in science; and section 5 presents concluding remarks and                        pothetico-deductive method. An hypothesis expresses a
perspectives of research development.                                            contingent relation between phenomena (Marconi and
                                                                                 Lakatos 2004, 137). Research in library and information
2.0 Materials and methods                                                        science, especially in domains such as indexing languages,
                                                                                 coordinated indexing systems, and information retrieval,
The insight to develop such a model came from literature                         gives special attention to relationships as keys for repre-
on philosophy of science and scientific methodology.                             senting meaning. Farradane’s (1980) relational indexing
Literature on rhetoric of scientific papers and different                        approach proposed “meaning, considered as relations be-
reasoning patterns found in them (Bezerman 1988; Gross                           tween terms.” According to Brookes (1980), “knowledge
1990; Hutchins 1977; Nwogu 1997; Skelton 1994) also                              is a structure of concepts linked by their relations and in-

                                                     https://doi.org/10.5771/0943-7444-2016-2-86
                                              Generiert durch IP '46.4.80.155', am 25.05.2021, 20:26:24.
                                       Das Erstellen und Weitergeben von Kopien dieses PDFs ist nicht zulässig.
A Model to Represent and Process Scientific Knowledge in Biomedical Articles with Semantic Web Technologies
Knowl. Org. 43(2016)No.2                                                                                                                    89
C. H. Marcondes and L. C. da Costa. A Model to Represent and Process Scientific Knowledge in Biomedical Articles ...

formation is a small part of such a structure.” Sheth et al.                     “clinical questions,”’ etc. All of these labels are related to
(2003) state, “relationships are fundamental to seman-                           fundamental issues in scientific inquiry.
tics—to associate meaning to words, items and entities.                              In the model proposed, scientific knowledge units
They are a key to new insights. Knowledge discovery is                           consist of the establishment of new relationships be-
about discovery of new relationships.”                                           tween phenomena. A phenomenon can be defined as a
    Although a complex phenomenon, scientific reason-                            “perceptible fact, a sensible occurrence” (Bunge 1998,
ing, as recommended by the scientific method and ex-                             173). Relationships in their simplest form could be mod-
pressed in the texts of scientific articles, plays an essential                  eled as triples of  
role in communication science, i.e. the validation of the                        . In a scientific article relationships may
knowledge contained in any article, thus enabling a scien-                       appear in different semantic components throughout the
tist to reproduce the steps taken by the author in any ex-                       article text, such as within the problem that the article
periment. The need for this rigid protocol when commu-                           addresses as a contingent relationship; as a question, in
nicating research results is stated by The International                         which either one of the two relata or the type of relation
Committee of Medical Journals Editors (2003):                                    is unknown; in the hypothesis as a hypothetical relation-
                                                                                 ship; or in the conclusion as an empirically tested rela-
   The text of observational and experimental articles                           tionship. Frequently, the conclusion also poses new ques-
   is usually (but not necessarily) divided into sections                        tions. “Questions,” “Hypothesis,” and “Conclusion” are
   with the headings Introduction, Methods, Results,                             the semantic components of the proposed model that
   and Discussion. This so-called “IMRAD” structure                              synthesize these issues. Such elements, according to the
   is not simply an arbitrary publication format, but                            scientific method, are also related in a reasoning chain. A
   rather a direct reflection of the process of scientific                       question evokes a tentative hypothesis of how to solve it;
   discovery.                                                                    the hypothesis must be tested by a practical experiment
                                                                                 which confirms or denies it; the results of the experiment
In addition to the IMRAD structure, in 1987 the Ad Hoc                           are synthesized in a conclusion. These semantic compo-
Working Group for Critical Appraisal of Medical Litera-                          nents converge to justify, support and guarantee the arti-
ture recommended the use of structured abstracts to im-                          cle’s conclusion.
prove the information of clinical articles. Since then an                            The conclusion is the essential semantic component, as
increasing number of biomedical journal editors have                             it synthesizes the knowledge content of an article and its
adopted the structure for abstracts. Structured abstracts                        contribution to a scientific domain. According to Samwald
are one of the roots of the model proposed here. Their                           (2009) “The conclusion sections of biomedical abstracts
adoption can improve scientific communication and reli-                          seem like a gold-mine for automated key assertion identi-
ability of the results reported in each paper. In the Struc-                     fication, since the relevant portion of text can be identi-
tured Abstract Labels Research Dataset, a large number                           fied easily.” However, in the scope of a recently-published
of structured abstract labels used in different biomedical                       article, the conclusion is a provisional knowledge unit, as it
journals can be found. Their variety stresses the impor-                         is at least validated by the experiment reported in the arti-
tance of making explicit the decisions made in the process                       cle and by the peer-review process by which the article
of scientific inquiry to improve scientific communication                        was accepted for publication. Semantic components such
and reliability of the results reported in the paper (Salager-                   as questions and hypotheses are important as they permit
Meyer 1991). The National Library of Medicine, too,                              the evolution of a claim to be traced. Other elements have
adopted a five label categories standard for structured ab-                      rhetorical functions, as extensively discussed in Skelton
stracts: background, objective, methods, results, and con-                       (1994) and Nwogu (1997), or serve to describe methodo-
clusion. Structured abstracts are a step towards a semantic                      logical options, the experiment performed, its context, or
format for scientific papers. A further step in this direc-                      display more clearly the results obtained. Other elements
tion would be to represent these semantic components in                          include the proposed objectives of the article and the de-
a machine-readable format so they can be processed by                            scription of the experiment performed.
programs.
   In the Structured Abstract Labels Research Dataset                            4.0 Proposal
previously mentioned, there are some suggested labels
used in structured abstracts that are found in a variety of                      The idea of a bibliographic record model is the result of a
medical journals; e.g., such as “problem addressed,” “basic                      long tradition that harkens back to Panizzi, Jewett, Cutter,
problem and aim of study,” “basic problems and objec-                            and Lubetzky (Bianchini and Guerrini 2009). Although
tives,” “questions of the study,” “questions under study,”                       these pioneers posed the differences between a work as an
“clinical questions,” “hypothesis,” “hypothesis tested,”                         abstract creation and the book that encompasses it, tech-

                                                     https://doi.org/10.5771/0943-7444-2016-2-86
                                              Generiert durch IP '46.4.80.155', am 25.05.2021, 20:26:24.
                                       Das Erstellen und Weitergeben von Kopien dieses PDFs ist nicht zulässig.
A Model to Represent and Process Scientific Knowledge in Biomedical Articles with Semantic Web Technologies
90                                                                                                       Knowl. Org. 43(2016)No.2
                  C. H. Marcondes and L. C. da Costa. A Model to Represent and Process Scientific Knowledge in Biomedical Articles ...

nological limitations of the earlier bibliographic systems                          Since the 1970s, many initiatives have been undertaken
imposed a simplified representation restricted to a brief                        to organize and standardize knowledge in biomedical sci-
description and the main access points. This representa-                         ences. Initially those initiatives took into account the ter-
tion encompasses different independent entities that were                        minological knowledge necessary to cope with indexing
mixed and simplified as mere record fields. This scenario                        biomedical literature. Recently, many models of scientific
began to change in 1998 with the publication of IFLA’s                           knowledge in biomedical sciences have been proposed to
Functional Requirements for Bibliographic Records                                address biomedical knowledge directly (OBO Foundry).
(1998).                                                                          To address the aims of these models, or ontologies, built
    Functional Requirements for Bibliographic Records, better                    on the principles of OBO Foundry, models are created
known as FRBR (IFLA 1998), is a conceptual model,                                with the epistemic approach of scientific realism, which
based in Chen’s (1976) entity-relationship (ER) model. An                        means that all instances of entities represented in such on-
ER model comprises the entities “a ‘thing’ which can be                          tologies must have real existence in time and space
distinctly identified,” the relationships “an association                        (Cleusters and Smith 2006).
among entities,” and the attributes of entities or relation-                        The model proposed makes quite a different assump-
ships in a specific domain. “The information about an en-                        tion. It is committed to modeling and discovering the
tity or a relationship is obtained by observation or meas-                       knowledge units that comprise scientific reasoning. Ac-
urement, and is expressed by a set of attribute-value                            cording to Shultz and Jassen (2013, 8), domain models in
pairs” (Chen 1976, 12). In the bibliographic domain enti-                        biomedical sciences may include four kinds of statements:
ties could be a work such as Hamlet, and its author, Wil-                        universal statements, terminological statements, state-
liam Shakespeare, a relationship would be the authorship                         ments about particulars, and contingent statements. We do
of Hamlet by William Shakespeare, an atribute of this                            not intend to model universal statements in biomedical
work entity would be its title, attributes of the author en-                     sciences. Accordingly, the model includes contingent or
tity might be his name and birth and death dates, respec-                        hypothetical knowledge units and knowledge units that are
tively 1564 and 1616. Also, FRBR identifies the relations                        not yet largely accepted in a scientific domain, such as the
between the work and its various expressions such as the                         conclusions of an article. Indeed these semantic compo-
text of a play or a film; their various manifestations, such                     nents that comprise scientific reasoning are the core of
as the 1994 English edition by Penguin Books or the 2007                         the model.
Portuguese edition by L&PM Editora; and the several                                 Scientific claims in articles take the form of relations
items available in a library’s holdings.                                         between phenomena or between a phenomenon and its
    The model presented here was first proposed in 2005                          characteristics. They are expressed linguistically through
and has been continuously improved since then (Marcon-                           propositions (Dahlberg 1995, 10):
des 2005). It addresses a related but a slightly different is-
sue from that addressed by the FRBR model. It extends                                a- “telomere shortening (Phenomenon) causes (Type
conventional bibliographic record models comprised of                                   _of_relation) cellular senescence (Phenomenon)”
descriptive elements such as authors, title, abstract, biblio-                          (Hao et al. 2005);
graphic source, publication date, and content information                            b- “telomere replication (Phenomenon) involves
such as keywords or descriptors and references to cited                                 (Type_of_relation) nontemplate addition of te-
papers. In addition to these bibliographic elements, the                                lomeric repeats onto the ends of chromosomes
model makes explicit and formalizes the claims made by                                  (Phenomenon)?” (Shampay et al. 1984); or
authors throughout the article text. These claims are the                            c- “tetrahymena extracts (Phenomenon) show (Type
basic knowledge units that comprise the scientific contri-                              _of_relation) a specific telomere tranferase activ-
butions of an article. They are not explicitly represented                              ity (Characteristic)” (Greider and Blackburn
nor coded in conventional bibliographic records and are                                 1985).
hard to find in article texts. The model proposes the cod-
ing of those claims in a machine-processable format, ena-                            In formal ontology and knowledge representation lit-
bling their use in tasks that demand intelligent processing.                     erature, both characteristics and relations are called prop-
Once explicit and coded, they may be integrated into rea-                        erties. We opt to differentiate characteristics (also called at-
soning chains, enabling semantic retrieval, knowledge re-                        tributes) as properties, which depend only on one entity
use, validation of scientific results, tracing of scientific                     instance, and relations, as properties that depend on two
discoveries, new scientific insights, and identification of                      or more entity instances. This decision makes our model
knowledge contradictions or inconsistencies. Such a                              compatible with OWL (Web Ontology Language), which
model is designed to be implemented in the semantic web                          has properties as one of its basic building blocks, although
linked open data platform.                                                       it distinguishes data type properties (attributes, characteris-

                                                     https://doi.org/10.5771/0943-7444-2016-2-86
                                              Generiert durch IP '46.4.80.155', am 25.05.2021, 20:26:24.
                                       Das Erstellen und Weitergeben von Kopien dieses PDFs ist nicht zulässig.
A Model to Represent and Process Scientific Knowledge in Biomedical Articles with Semantic Web Technologies
Knowl. Org. 43(2016)No.2                                                                                                                       91
C. H. Marcondes and L. C. da Costa. A Model to Represent and Process Scientific Knowledge in Biomedical Articles ...

tics) from object properties (relations). Furthermore, this                              rates, totally or partially, the hypothesis of an article
decision has to do with the proposed classification of pat-                              or negates it. A conclusion may also be conclusive
terns of reasoning in scientific articles. One type of arti-                             or not yet conclusive;
cle, experimental-exploratory articles (EE), is character-
ized as describing a new phenomenon by proposing rela-                           – Semantic components such as question, hypothesis,
tions, not between two different phenomena, but between                            previous hypothesis, and conclusion are comprised of:
a phenomenon and its characteristics, as are shown in the                          – an antecedent;
following paragraphs.                                                              – a type_of_relation (holding the semantic of the re-
    As explained in the previous section, claims such as                              lation in a domain, for example, in biomedical sci-
basic knowledge units represented in the model, may                                   ences); and
appear in different semantic elements that comprise the                            – the consequent.
model, such as in a question within the problem, in the
hypothesis and in the conclusion. Semantic components                            The antecedent and consequent may be two different phe-
such as questions and hypotheses are important because                           nomena or one single phenomenon and its characteristics.
they enable the tracing of a claim over time. However,                              Articles differ in the way they are built around previ-
the conclusion is the core semantic component of the                             ously-stated hypotheses, or those stated by authors other
proposed knowledge representation schema. There are                              than the authors of the current article, or new, original
other elements in the model which are not represented as                         hypotheses, i.e., those stated by the authors of the cur-
relations, such as the objective and the experiment. The                         rent article. Articles may also differ in documenting an
semantic components that comprise the proposed model                             experiment or just discussing theoretical considerations
are described below.                                                             when comparing previously-stated hypotheses. In the
                                                                                 model proposed, four types of articles are identified by
– The semantic article is a complex digital object with                          the patterns of reasoning they contain: theoretical articles
  conventional bibliographic metadata and with links to                          and experimental articles, which may be just exploratory
  its full-text and to other articles that are cited or that                     articles or employ inductive or deductive reasoning. The
  cite one specific article. Its components are the follow-                      four patterns of reasoning are described as follows:
  ing:
  – the problem the article is addressing and the ques-                          – Theoretical-abductive (TA) articles analyze different,
      tion derived from it;                                                        previous hypotheses, showing their faults and limita-
  – an original or new hypothesis, aimed at addressing                             tions and proposing a new hypothesis; the reasoning is
      the question proposed;                                                       as follows:
  – sometimes authors develop experiments in order to                              – A problem is identified, with the following aspects
      corroborate or negate a previous hypothesis, one                                and data … ;
      which was proposed by some other author;                                     – The previous hypotheses (from other authors) are
  – an empirical experiment is also described with the                                not satisfactory to solve the problem due to the fol-
      aim of observing the phenomenon described it re-                                lowing criticism … ;
      lations with other phenomena and specific charac-                            – Therefore, we propose this new (original) hypothe-
      teristics which is comprised of:                                                sis, which we consider a new pathway to solve the
  – results: tables, figures, and numeric data reporting                              problem.
      the observations made;
  – measurement used;                                                            – Experimental-inductive (EI) articles propose a hy-
  – the specific context where the empirical observa-                              pothesis and develop experiments to test and validate
      tions take place with the following components:                              it; the reasoning is as follows:
      – environment, e.g., a hospital, a daycare center, a                         – A problem is identified, with the following aspects
          high school;                                                                  and data … ;
      – geographical location where the empirical obser-                           – A possible solution to this problem can be based
          vations occur;                                                                on the following new hypothesis … ;
      – time when the empirical observations occur;                                – We developed an experiment to test this hypothesis
      – specific population in which the phenomenon                                     and obtained the following results.
          occurs, e.g., pregnant women, early born babies,
          mice;                                                                  In experimental-inductive articles, a conclusion may be
  – conclusions: a set of propositions made by authors                           mainly one of these alternatives: it corroborates the hy-
      as a result of the findings. A conclusion corrobo-                         pothesis, refutes it, or partially corroborates the hypothe-

                                                     https://doi.org/10.5771/0943-7444-2016-2-86
                                              Generiert durch IP '46.4.80.155', am 25.05.2021, 20:26:24.
                                       Das Erstellen und Weitergeben von Kopien dieses PDFs ist nicht zulässig.
92                                                                                                      Knowl. Org. 43(2016)No.2
                 C. H. Marcondes and L. C. da Costa. A Model to Represent and Process Scientific Knowledge in Biomedical Articles ...

sis. However, in some cases, the conclusion is not one of                          Recently, different ontologies have been proposed
the former; it simply reports intermediate but not conclu-                      with the aim of aggregating semantics processable by
sive results toward the hypothesis corroboration.                               programs to scientific articles in digital format. Among
                                                                                others are CITO, the citation ontology, which aims to
– Experimental-deductive (ED) articles use a hypothesis                         make explicit the reasons for citation in a scientific article,
  proposed by other researchers, cited by the articles’ au-                     and EXPO, ontology for biomedical experiments, which
  thor, and apply it to a slightly different context; the                       is an intermediate-level ontology between the top-level
  reasoning is as follows:                                                      ontology SUMO and domain-specific level ontologies. It
  – A problem is identified, with the following aspects                         aims to formalize knowledge about scientific experi-
     and data … ;                                                               ments’ designs, methodologies procedures, and results.
  – In the literature, the previous hypotheses (by other                           The use foreseen for these ontologies is the annota-
     authors) have been proposed … ;                                            tion of scientific papers or scientific experiments (Solda-
  – We choose the following previous hypothesis … ;                             tova and King 2005). In practical use cases, annotation is
  – We enlarge and recontextualise this hypothesis; we                          a problematic task, as it may be considered as an addi-
     develop an experiment to test it in this new context                       tional task among many others performed by scientists in
     …;                                                                         their everyday work. Although these ontologies are very
  – The experiment shows the following results in this                          expressive in formalizing scientific knowledge, their use
     new context.                                                               in annotation is a strong limitation to their practical use.
                                                                                   The components described in the model, once coded
– Experimental-exploratory (EE) articles usually are not                        in program-understandable formats using semantic web
  hypothesis driven; their objective is to acquire knowl-                       standards, constitute rich knowledge representations,
  edge about a poorly-understood scientific phenome-                            which can enable direct management of knowledge con-
  non by performing an experiment; the reasoning is as                          tained in scientific articles, their use in automatic reason-
  follows:                                                                      ing and inference tasks applied to different and unpre-
  – There is a phenomenon that is poorly understood                             dicted contexts, thus enlarging the possibilities for auto-
     in a scientific domain.                                                    matic processing of the rich digital content now available
  – We developed an experiment that permits the iden-                           throughout the Web.
     tification of the following characteristics of this                           However, the semantic components provided by the
     phenomenon.                                                                model are hard to capture within the scope of the current
                                                                                scholarly electronic publishing environment. We avoid us-
These basic semantic components of scientific articles                          ing them for annotation. Thus, we propose to engage au-
are interrelated and structured. Together with the corre-                       thors in developing a richer content representation of
sponding bibliographic metadata and article text, they                          their articles. To take full advantage of the facilities pro-
form richer article surrogates able to be coded in ma-                          vided by the model, a future scholarly electronic publish-
chine-understandable formats. Once coded, they consti-                          ing framework should be developed, a scholarly elec-
tute single digital objects which may be stored in a digital                    tronic article editor and submission system able to cap-
library or electronic journal publishing system. All these                      ture, formalize, and code the elements provided by the
features are formalized in the semantic model of articles                       model.
(SMA) illustrated in Figure 1.                                                     We propose here some initial steps toward this frame-
                                                                                work. Researchers are accustomed to self-describing their
5.0 Model implementation and potentialities                                     papers when submitting them to a digital library, to a
                                                                                conference, to a digital repository or to a journal system.
This section describes partial implementations of the                           The submission of an article to a journal system is a
model proposed, developed with the aim of demonstrat-                           privileged process during which authors are particularly
ing its feasibility.                                                            motivated to clarify and disambiguate questions about
   Nowadays considerable effort has been expended in                            their articles. In our proposal we take advantage of this
extracting and formalizing knowledge from the unstruc-                          moment. We have developed a prototype system of a
tured text of scientific articles. Within this scope a main                     Web author’s submission interface to a journal system,
focus is the article’s conclusion and its feasibility to be                     which partially implements the model (Costa 2010). In
represented as a relationship using RDF coding (Sam-                            the prototype developed, authors use a Web submission
wald, 2009). This trend could be enhanced if the article                        interface to a journal system to type, in addition to stan-
conclusion as a knowledge unit could be captured as a                           dard metadata, the article conclusions at the moment of
structured RDF triple as in the model proposed.                                 submission and upload of the article text. The system

                                                    https://doi.org/10.5771/0943-7444-2016-2-86
                                             Generiert durch IP '46.4.80.155', am 25.05.2021, 20:26:24.
                                      Das Erstellen und Weitergeben von Kopien dieses PDFs ist nicht zulässig.
Knowl. Org. 43(2016)No.2                                                                                                                                                                                                              93
C. H. Marcondes and L. C. da Costa. A Model to Represent and Process Scientific Knowledge in Biomedical Articles ...

                                                                                                                       Figure1. Semantic Model of Articles. The dashed line delimits the portion of the model that was implemented.

                                                     https://doi.org/10.5771/0943-7444-2016-2-86
                                              Generiert durch IP '46.4.80.155', am 25.05.2021, 20:26:24.
                                       Das Erstellen und Weitergeben von Kopien dieses PDFs ist nicht zulässig.
94                                                                                                          Knowl. Org. 43(2016)No.2
                     C. H. Marcondes and L. C. da Costa. A Model to Represent and Process Scientific Knowledge in Biomedical Articles ...

      Questions                                                     Author 1          Author 2         Author 3            Author 4   Author 5   Author 6
  1   I think that I would like to use this system frequently            5                 3                4                 5          4          4
  2   I found the system unnecessarily complex                           1                 1                1                 1          2          2
  3   I thought the system was easy to use                               5                 2                4                 5          4          4
      I think that I would need the support of a technical
  4                                                                      1                 1                1                 1          1          1
      person to be able to use this system
      I found the various functions in this system were well
  5                                                                      4                 3                4                 5          4          3
      integrated
      I thought there was too much inconsistency in this sys-
  6                                                                      1                 3                1                 1          1          1
      tem
      I would imagine that most people would learn to use
  7                                                                      5                 4                4                 4          4          4
      this system very quickly
  8   I found the system very cumbersome to use                          1                 1                1                 1          3          1
  9   I felt very confident using the system                             4                 3                5                 4          3          4
      I needed to learn a lot of things before I could get going
 10                                                                      4                 1                1                 1          1          1
      with this system

                                                                Table 1. Results of system usability test.

performs natural language processing (NLP) of the con-                                       Despite the difficulties in engaging authors to test the
clusion, breaking it into short pieces of text as it is typed,                            prototype a test was performed with six authors, profes-
formatting it as a relationship. The system interacts with                                sors and a researcher of the Biomedical Institute of our
authors, asking them to validate the relation extracted and                               university. An interview was performed with each author,
the mapping done by the system of concepts found in                                       relative to an article of his or her authorship. Authors
the conclusion to concepts in a domain-specific, termino-                                 were asked to identify within the article text the question,
logical knowledge base. In the case of the prototype de-                                  the hypothesis and the article conclusion. Afterward, au-
veloped, the terminological knowledge base used is the                                    thors were asked to submit the article's conclusion to the
UMLS. The results of this processing are recorded as a                                    prototype. Authors spent an average time of twelve min-
bibliographic record, rich, in semantic content, in which                                 utes and twenty-three seconds to interact with the proto-
scientific claims made by authors throughout the articles                                 type to submit an article. After using the prototype, au-
are expressed by relations. Each article, in addition to be-                              thors were asked to respond to a questionnaire to evalu-
ing published in textual format, has its claims also repre-                               ate system usability (Usability.gov). The System Usability
sented as structured relations and recorded in program-                                   Scale (SUS) provides a quick and dirty reliable tool for
understandable format using semantic web standards                                        measuring the usability. It consists of a ten-item ques-
such as RDF and OWL. These semantic records are also                                      tionnaire with five response options for respondents;
formally related by the author, i.e. mapped and annotated                                 from 5 (Strongly agree) to 1 (Strongly disagree). The pro-
to concepts in a standard terminological knowledge base                                   totype average usability was high, 83.75. Results are as
expressing the author’s own view and judgment of how                                      follows and are shown in Table 1.
the conclusion of the article might be represented in such                                   Besides the conventional bibliographic metadata the
terminology.                                                                              only additional metadata that the prototype interface asks
   Authors are asked to validate the automatic mapping                                    to the authors to type are the article's objective and con-
made by the system, even choosing other terms from a list                                 clusion. In all test cases the prototype was able to formal-
displayed by the system or deciding that there is not any                                 ize the article conclusion as a relationship and the formal-
satisfactory mapping among the options offered; in this                                   ized conclusion was approved by the author.
case, the system assigns “no mapping” to this specific ele-                                  Figures 2-4 illustrate some of the steps involved in the
ment of the relation. The article conclusion, formatted as a                              processing of the following conclusion (Segundo et al.
relation and with terms of its “antecedent,” “type_of_                                    2004): “The results presented herein emphasize the im-
relation” and “consequent” annotated by the author to                                     portance to accomplish systematic serological screening
terms in the UMLS is then recorded as a rich article surro-                               during pregnancy in order to prevent the occurrence of
gate.                                                                                     elevated number of infants with congenital toxoplasmo-

                                                              https://doi.org/10.5771/0943-7444-2016-2-86
                                                       Generiert durch IP '46.4.80.155', am 25.05.2021, 20:26:24.
                                                Das Erstellen und Weitergeben von Kopien dieses PDFs ist nicht zulässig.
Knowl. Org. 43(2016)No.2                                                                                                                95
C. H. Marcondes and L. C. da Costa. A Model to Represent and Process Scientific Knowledge in Biomedical Articles ...

                                          Figure 2. The author specifies the article conclusion.

                                       Figure 3. The article conclusion is formatted as a relation.

sis.” The screens were captured from the prototype sys-                              Both RDF documents in Figure 5 are named-graphs,
tem developed in Costa (2010).                                                   meaning that their triples are identified by a URI. Even a
    This framework enables the posterior use of these                            partial implementation of the record model proposed in
surrogates to compare their content to terminologies like                        RDF, where the only semantic component captured is the
the UMLS in order to identify related claims in different                        conclusion, will facilitate more expressive semantic re-
articles or traces of discoveries at the moment of article                       trieval from a knowledge network enabling queries such
publication, which may be advantageous when compared                             as the following:
to methods such as article citations.
    Figures 5-6 show how the conclusion “telomere                                – Which other articles have hypotheses suggesting HPV
replication [Antecedent] involves [Type_of_relation] a                             as the cause of cervical neoplasias in women?
terminal transferase-like activity [Consequent],” found in                       – Which articles have hypotheses suggesting other causes
(Greider and Blackburn. 1987), may be formated in RDF.                             of cervical neoplasias different from HPV in women?

                                                     https://doi.org/10.5771/0943-7444-2016-2-86
                                              Generiert durch IP '46.4.80.155', am 25.05.2021, 20:26:24.
                                       Das Erstellen und Weitergeben von Kopien dieses PDFs ist nicht zulässig.
96                                                                                                      Knowl. Org. 43(2016)No.2
                 C. H. Marcondes and L. C. da Costa. A Model to Represent and Process Scientific Knowledge in Biomedical Articles ...

                   Figure 4. Authors are asked to map/annotate concepts in the article’s conclusion to UMLS terms.

– Which articles have hypotheses suggesting HPV as the                          ticles analyzed. The results are the following: 27 articles
  cause of cervical neoplasias in groups different from                         were classified as as experimental-inductives (EI), 44 as ex-
  women?                                                                        perimental-deductives (ED), 15 as experimental-explora-
– Which articles have hypotheses suggesting HPV as the                          tories (EE), and 3 as theoretical-abductives (TA). Details
  cause of pathologies different from neoplasias?                               can be found at Marcondes et al. (2014).
– Which articles have hypotheses suggesting HPV as the                              These general frameworks may be used to identify
  cause of cervical neoplasias in different contexts (not                       discoveries reported in scientific articles based on two
  in women from Federal District, Brazil)?                                      aspects: the evolution of their rhetorical patterns within a
                                                                                chronological series of articles reporting an important
The model also enables queries that may indicate new dis-                       scientific discovery and by comparing the content of the
coveries, for example, new causes of cellular senescence:                       articles’ conclusions with terminological knowledge
                                                                                bases. We found a characteristic evolution of these
– Which experimental-inductive articles propose (Antece-                        patterns in the Lasker Medical Award 2006 articles group,
  dent?) causes (Type_of_relation) to cellular senescence                       as predicted by authors on scientific discovery. Articles
  (Consequent) that are not mapped to UMLS concepts?                            within this group have a specific reasoning pattern when
– Is there any confirmation of the hypothesis that “Sev-                        analyzed chronologically; additionally, the mapping/non-
  eral aspects of both the structural and dynamic proper-                       mapping of terms in the conclusions to terminological
  ties of telomeres (Antecedent) led to the proposal that                       knowledge bases as an indicator, as proposed, also shows
  telomere replication involves (Type_of_relation) non-                         a specific pattern within the same chronological series.
  template addition of telomeric repeats onto the ends of                       Articles published before the publishing of the article
  chromosomes (Consequent)” (Shampay et al. 1984)?                              which marks the discovery of the telomerase enzyme and
– Who first maintained, and when, that “the RNA com-                            named it in 1985 were all of the experimental-exploratory
  ponent of telomerase (Antecedent) may be directly in-                         (EE) type and achieved non-mapping (NM) of the con-
  volved in (Type_of_relation) recognizing the unique                           cepts in their conclusions to MeSH (Medical Subject Head-
  three-dimensional structure of the G-rich telomeric oli-                      ings) concepts. Experimental-exploratory (EE) articles
  gonucleotide primers (Consequent)” (Greider and                               seem to characterize first steps toward the discovery of
  Blackburn 1987)?                                                              new phenomena. After the discovery of telomerase and
                                                                                the coinage of its scientific name in 1985, the first non-
The article’s types and corresponding reasoning patterns                        experimental-exploratory (EI and ED) articles appear.
described in the model enabled classification of the 89 ar-                     Furthermore, just after 1986 appear the first partially

                                                    https://doi.org/10.5771/0943-7444-2016-2-86
                                             Generiert durch IP '46.4.80.155', am 25.05.2021, 20:26:24.
                                      Das Erstellen und Weitergeben von Kopien dieses PDFs ist nicht zulässig.
Knowl. Org. 43(2016)No.2                                                                                                                   97
C. H. Marcondes and L. C. da Costa. A Model to Represent and Process Scientific Knowledge in Biomedical Articles ...

   title
   creator
   subject
   date
     http://art_id/conclusion
   
    telomere replication
   
      http://www.nlm.nih.gov/research/umls/CUI01
   
    involves
   
      http://www.nlm.nih.gov/research/umls/CUI02
   
             terminal transferase-like activity
   
      http://www.nlm.nih.gov/research/umls/CUI03
   
           Figure 5. An article and its conclusion, represented as two RDF documents. CUI means “concept unique identifier”

mapped (PM) articles. Details can be found in Malheiros                          concept, which is identified by its Concept Unique Iden-
and Marcondes (2013).                                                            tifier (CUI) as “telomerase activity,” which is a generic
    The model might also enable researchers to find arti-                        term relative to the first; a software agent might infer a
cles with related claims, from which new knowledge may                           new claim, i.e., that (maybe) “telomere shortening” “is as-
be inferred, as in the following example. Suppose an arti-                       sociated with” “cancer.” The claim is trusted based on
cle’s conclusion claims that “telomere shortening causes                         the evidence presented in the experiments described in
cellular senescence,” while another article’s conclusion                         both articles and by the judgment of journal referees,
claims that “telomerase activity is associated with cancer.”                     who certified that both articles had sufficient scientific
The concepts “telomere shortening” and “telomerase ac-                           quality to merit publication. Figure 7 illustrates this case.
tivity” are both mapped, i.e., linked, to the same UMLS

                                                     https://doi.org/10.5771/0943-7444-2016-2-86
                                              Generiert durch IP '46.4.80.155', am 25.05.2021, 20:26:24.
                                       Das Erstellen und Weitergeben von Kopien dieses PDFs ist nicht zulässig.
98                                                                                                        Knowl. Org. 43(2016)No.2
                   C. H. Marcondes and L. C. da Costa. A Model to Represent and Process Scientific Knowledge in Biomedical Articles ...

                                 Figure 6. RDF data set representing a semantic article with its conclusion.

   These examples show how the knowledge representa-                               ing towards a semantic format. The adoption of structured
tion schema proposed may improve semantic retrieval and                            abstracts by many biomedical journals is a move in this di-
the use of knowledge in different and unpredicted con-                             rection. Within such a context, in light of the new com-
texts.                                                                             puter-driven scientific methods, the question arises of
   The model proposed is broad and aims to encompass                               whether the semantic components of scientific methodol-
the complete scientific publishing and information retrieval                       ogy, such as problem, question, and hypothesis, are still
environment throughout the Web. For this reason, it is                             valid as necessary stages in the construction of solid scien-
very difficult to achieve a comprehensive test which could                         tific claims and bases of scientific knowledge. The amount
validate the model as whole. Such a test also depends on                           of scientific literature published throughout the Web is be-
the development of software tools that have not been                               coming increasingly vast and complex. It will be necessary
developed yet. More tests and experiments must be                                  for scientists to have enhanced software tools in order to
achieved to arrive at full potencialities. Our research group                      process this content. The Web provides a wholly new plat-
has not been able to fully develop the model to the                                form for publishing, sharing, and interlinking scientific ac-
potentialities outlined here. We consider the model out-                           tivities and data. At present, the distinction between elec-
lined a starting point that can be discussed and built upon                        tronic journals and databases also are blurring. This inte-
by the scientific community.                                                       grated knowledge network could be crawled by software
                                                                                   agents, thus helping scientists in semantic retrieval, knowl-
6.0 Concluding remarks                                                             edge reuse, validation of scientific results and identification
                                                                                   of traces of scientific discoveries, new scientific insights,
Several signs indicate that digital scientific articles are over-                  and knowledge contradictions or inconsistencies.
coming the limitations of the paper-print model that pre-                              In this paper we propose a semantic model of schol-
vailed for centuries in scientific journals and now are head-                      arly electronic articles in biomedical sciences that can

                                                       https://doi.org/10.5771/0943-7444-2016-2-86
                                                Generiert durch IP '46.4.80.155', am 25.05.2021, 20:26:24.
                                         Das Erstellen und Weitergeben von Kopien dieses PDFs ist nicht zulässig.
Knowl. Org. 43(2016)No.2                                                                                                                   99
C. H. Marcondes and L. C. da Costa. A Model to Represent and Process Scientific Knowledge in Biomedical Articles ...

                                             Figure 7. Related claims in two semantic articles

overcome the limitations of traditional flat records for-                           Abstracts of Clinical Articles.” Annals of Internal Medi-
mats. We also describe some partial implementations of                              cine 106, no. 4: 598-604.
the model that demonstrate its feasibility and utility to                        ArrayExpress. http://www.ebi.ac.uk/arrayexpress/
manage scientific knowledge. Knowledge organization                              Bacon, Francis. 1973. Novum Organum. São Paulo: Abril Cul-
can go beyond just using conventional indexing tech-                                tual.
niques to provide quick access to full-text scientific arti-                     Berners-Lee, Tim, James Hendler and Ora Lassila. 2001.
cles. It can help scientists to directly process the knowl-                         “The Semantic Web.” Scientific American 284, no. 5: 29-37.
edge content of scientific articles and to recognize the                         Bazerman, Charles. 1988. Shaping Written Knowledge: The
reasoning that leads to a scientific discovery. The model                           Genre and Activity of the Experimental Article in Science.
proposed also points to the standardization of a scientific                         Madison: University of Wisconsin Press.
knowledge markup language encompassing the knowl-                                Bianchini, Carlo and Mauro Guerrini. 2009. “From Bib-
edge content of Web-published scientific articles, taking a                         liographic Models to Cataloging Rules: Remarks on
step forward to proposals like those of Murray-Rust and                             FRBR, ICP, ISBD, and RDA and the Relationships Be-
Rzepa (1999; 2002) and Hucka et al. (2003). This opens a                            tween Them.” Cataloging & Classification Quarterly 47,
new perspective in scientific electronic publishing,                                no. 2: 105-24.
knowledge acquisition, storage, processing, and sharing.                         Bizer, Christian, Tom Heath and Tim Berners-Lee. 2009.
                                                                                    “Linked Data-The Story So Far.” International Journal on
References                                                                          Semantic Web and Information Systems 5, no. 3:1-22.
                                                                                 Brookes, Bertram C. 1980. “The Foundations of Infor-
Ad Hoc Working Group for Critical Appraisal of Medical                              mation Science. Part I. Philosophical Aspects.” Journal
  Literature. 1987. “A Proposal for More Informative                                of Information Science 2, nos. 3-4: 125-33.

                                                     https://doi.org/10.5771/0943-7444-2016-2-86
                                              Generiert durch IP '46.4.80.155', am 25.05.2021, 20:26:24.
                                       Das Erstellen und Weitergeben von Kopien dieses PDFs ist nicht zulässig.
100                                                                                                         Knowl. Org. 43(2016)No.2
                     C. H. Marcondes and L. C. da Costa. A Model to Represent and Process Scientific Knowledge in Biomedical Articles ...

Bunge, Mario. 1998. Philosophy of Science. New Brunswick,                              Hucka, Michael, et al. 2003. “The Systems Biology
   N.J.: Transaction Publishers.                                                          Markup Language (SBML): A Medium for Representa-
Ceusters, Werner and Barry Smiths. 2006. “A Realism-                                      tion and Exchange of Biochemical Network Models.”
   Based Approach to the Evolution of Biomedical On-                                      Bioinformatics 19, no. 4: 524-31.
   tologies.” In Biomedical and Health Informatics: From Foun-                         Hutchins, John. 1977. “On the Structure of Scientific
   dations to Applications to Policy: AMIA Annual Symposium                               Texts.” In Proceedings of UEA Papers in Linguistics, Sep-
   Proceedings, edited by David W Bates, John H. Holmes                                   tember 5th 1977, Norwich, UK: University of East An-
   and Gilad J. Kuperman. Bethesda, Md.: American                                         glia, 18-39.
   Medical Informatics Association, 121-5. http://www.                                 International Committee of Medical Journals Editors.
   ncbi.nlm.nih.gov/pmc/articles/PMC1839444/                                              2003. http://www.icmje.org
Chen, Peter Pin-Shan. 1976. “The Entity-Relationship                                   IFLA Study Group on Functional Requirements for Bib-
   Model—Toward a Unified View of Data.” ACM Trans-                                       liographic Records.1998. Functional Requirements for Bib-
   actions on Database Systems 1, no. 1: 9-36. http://csc.lsu.                            liographic Records: Final Report. München: K. G. Saur.
   edu/news/erd.pdf                                                                    Malheiros, Luciana Reis and Carlos Henrique Marcondes.
CITO, The citation ontology. http://speroni.web.cs.uni                                    2013. “Identificación de indicios de descubrimientos
   bo.it/cgi-bin/lode/req.py?req=http:/purl.org/spar/                                     científicos en artículos biomédicos mediante análisis de
   cito                                                                                   contenidos.” Revista Española de Documentación Científica
Costa, Leonardo Cruz da. 2010. Uma proposta de processo de                                36, no. 2. doi: http://dx.doi.org/10.3989/redc.2013.2.
   submissão de artigos científicos à publicações eletrônicas semânticas                  915
   em ciências biomédicas. Tese doutorado, Programa de Pós-                            MARC Standards. http://www.loc.gov/marc/
   graduação em Ciência da Informação UFF-IBICT, Ni-                                   Marconi, Marina de Andrade and Eva Maria Lakatos.
   terói., Brazil.                                                                        2004. Fundamentos de Metodologia Científica. 5th ed. São
Dahlberg, Ingetraut. 1995. “Conceptual Structures and                                     Paulo: Atlas.
   Systematization.” International Forum on Information and                            Marcondes, Carlos Henrique. 2005. “From Scientific
   Documentation 20, no. 3: 9-24.                                                         Communication to Public Knowledge: The Scientific
DrugBank. http://www.drugbank.ca/                                                         Article Web Published as a Knowledge Base.” In From
Dublin Core. http://dublincore.org/                                                       Author to Reader: Challenges for the Digital Content Chain:
EXPO. Ontology for Biomedical Experiments. http://                                        Proceedings of the 9th ICCC International Conference on Elec-
   expo.sourceforge.net/                                                                  tronic Publishing, June 8-10, 2005, Katholieke Universiteit
Farradane, J. 1980. “Relational indexing. Part I.” Journal of                             Leuven, Belgium, edited by Milena Dobreva and Jan
   Information Science 1, no. 5: 267-76.                                                  Engelen. Leuven, Belgium: Peeters, 119-126. http://
Garfield, Eugene, Irving H. Sher and Richard J. Torpie.                                   eprints.rclis.org/bitstream/10760/7389/1/ELPUB_200
   1964. The Use of Citation Data in Writing the History of                               5-Marcondes.pdf
   Science. Philadelphia: The Institute for Scientific Infor-                          Marcondes, Carlos H., Luciana R. Malheiros and Leo-
   mation.                                                                                nardo C. da Costa. 2014. “A Semantic Model for Schol-
GenBank. http://www.ncbi.nlm.nih.gov/genbank/                                             arly Electronic Publishing in Biomedical Sciences.” Se-
GO – Gene Ontology. http://geneontology.org/                                              mantic Web Journal 5, no. 4: 313-34.
Greider, Carol W. and Elizabeth H. Blackburn. 1985.                                    Miller, David L. 1947. “Explanation versus Description.”
   “Identification of a Specific Telomere Terminal Trans-                                 Philosophical Review 56, no. 3: 306-12.
   ferase Activity in Tetrahymena Extracts.” Cell 43: 405-                             Murray-Rust, Peter and Henry S. Rzepa. 1999. “Chemical
   13.                                                                                    Markup, XML and the World Wide Web. I: Basic
Greider, Carol W. and Elizabeth H. Blackburn. 1987. “The                                  principles.” Journal of Chemical Information and Computer
   Telomere Terminal Transferase of Tetrahymena is a                                      Science 39, no. 6: 928-42.
   Ribonucleoprotein Enzyme with Two Kinds of Primer                                   Murray-Rust, Peter and Henry S. Rzepa. 2002. “STMML.
   Specificity.” Cell 51: 887-98.                                                         A Markup Language for Scientific, Technical and
Gross, Alan G. 1990. The Rhetoric of Science. Cambridge,                                  Medical Publishing.” Data Science Journal 1 no. 2: 128-93.
   Massachusetts; London: Harvard University Press.                                    Nwogu, Kevin Ngozi. 1997. “The Medical Research
Hao, Ling-Yang, Mary Armanios, Margaret A. Strong, Bak-                                   Paper: Structure and Functions.” English for Specific
   tiar Karim, David M. Feldser, David Huso and Carol W.                                  Purposes 16, no. 2: 119-38.
   Greider. 2005. “Short Telomeres, Even in the Presence                               Open Biomedical Ontologies (OBO) Foundry. http://
   of Telomerase, Limit Tissue Renewal Capacity.” Cell                                    www.obofoundry.org/
   123: 1121–31.                                                                       Ontology for Experiment Self-Publishing. http://www.
                                                                                          w3.org/wiki/HCLS/ScientificPublishingTaskForce

                                                           https://doi.org/10.5771/0943-7444-2016-2-86
                                                    Generiert durch IP '46.4.80.155', am 25.05.2021, 20:26:24.
                                             Das Erstellen und Weitergeben von Kopien dieses PDFs ist nicht zulässig.
You can also read