INTRODUCTION
Controlled vocabularies are crucial to almost all healthcare applications. Clinical systems collect data about patient care and require controlled terms both for
mundane functions, such as billing, and for more sophisticated ones, such as electronic medical records. General information systems need controlled vocabularies to index articles from journals, books, and proceedings so
that information retrieval is possible. Medical expert systems use controlled vocabularies to map patients' data to their knowledge sources in search of a solution to a patient care scenario. It is surprising that
such a critical area of medical informatics, natural language processing (NLP) and controlled vocabularies, has been neglected so far. Researchers are only now realizing that advances in other areas, such as computer-based
patient record systems and knowledge-based systems, cannot proceed further without a standardized nomenclature and classification of medical terms. Several modern nomenclature and classification systems
exist today, among them:
SNOMED: Systematized Nomenclature of Human and Veterinary Medicine
UMLS: Unified Medical Language System
CPT: Current Procedural Terminology
ICD: International Classification of Diseases
In this article, these nomenclature and classification systems are compared and contrasted along the following axes: history of their development, organizing principles, structure, current usage, and
usage requirements for clinical vocabularies.
SNOMED
SNOMED is the Systematized Nomenclature of Human and Veterinary Medicine. SNOMED International was introduced
in September 1993 and traces its roots to the early 1960s and the Systematized Nomenclature for Pathology. SNOMED International is a comprehensive, multiaxial nomenclature and classification system created for the
indexing of the entire medical record, including signs and symptoms, diagnoses, and procedures. Its design is intended to allow full integration of all medical information in the electronic medical record into a single
data structure. The most recent version of SNOMED International (ver. 3.4) contains more than 150,000 terms and term codes in 11 separate
modules, listed below. The numbers in parentheses indicate the number of records contained within each module.
Topography | A functional anatomy for human and veterinary medicine | (12,803 records)
Morphology | Terms used to name and describe structural changes in disease and abnormal development | (5,672 records)
Function | Terms used to describe the physiology and pathophysiology of disease processes | (18,027 records)
Living Organisms | Living organisms of etiological significance in human and animal disease | (24,480 records)
Chemicals, Drugs, and Biological Products | Including pharmaceutical manufacturers | (14,275 records)
Physical Agents, Activities, and Forces | A compilation of physical activities, physical hazards, and the forces of nature | (1,410 records)
Occupations | Developed by, and used with permission from, the International Labour Office in Geneva, Switzerland | (1,947 records)
Social Context | Social conditions and relationships of importance to medicine | (845 records)
Diseases/Diagnoses | A classification of the recognized clinical conditions encountered in human and veterinary medicine | (34,377 records)
Procedures | A classification of health care procedures | (28,685 records)
General Linkages/Modifiers | Linkages, descriptors, and qualifiers to link or modify terms from each module | (1,373 records)
SNOMED International is rapidly being accepted worldwide as the standard for indexing medical record information. The American Veterinary Medical Association and the American
Dental Association have adopted or endorsed SNOMED for their own use. In addition, SNOMED is specified as the controlled terminology and message
standard for the interchange of biomedical images and image-related information in the DICOM (Digital Imaging and Communications in Medicine) standards.
An example of SNOMED as a nomenclature and as a classification system is shown below:
Nomenclature | Classification
Topography + Morphology + Etiology + Function = | Disease
Crystalline lens + Cataract, Mature + Acquired + Low vision = | Disease of lens
T-XX700 + M-51120 + E-0024 + F-X0050 = | D-X080
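The pairing in the example above, several axis terms on the nomenclature side and one disease code on the classification side, can be sketched in a few lines of Python. This is only an illustration: the `AxisTerm` class, the `nomenclature` and `classify` helpers, and the lookup table are hypothetical and are not part of SNOMED; only the codes and terms are taken from the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AxisTerm:
    """One coded term on one axis (hypothetical helper type)."""
    axis: str
    code: str
    term: str


# The four axis terms from the lens example in the text.
LENS_FINDING = (
    AxisTerm("Topography", "T-XX700", "Crystalline lens"),
    AxisTerm("Morphology", "M-51120", "Cataract, Mature"),
    AxisTerm("Etiology", "E-0024", "Acquired"),
    AxisTerm("Function", "F-X0050", "Low vision"),
)

# Assumed lookup table: a set of axis codes maps to one disease entry.
CLASSIFICATION = {
    frozenset(t.code for t in LENS_FINDING): ("D-X080", "Disease of lens"),
}


def nomenclature(terms):
    """Render the multiaxial (nomenclature) form as 'code + code + ...'."""
    return " + ".join(t.code for t in terms)


def classify(terms):
    """Return the (code, name) disease entry for a code combination, or None."""
    return CLASSIFICATION.get(frozenset(t.code for t in terms))


if __name__ == "__main__":
    print(nomenclature(LENS_FINDING), "=", classify(LENS_FINDING))
```

Storing the combination as a set makes the lookup independent of the order in which the axis terms are recorded, which is a design choice of this sketch rather than a documented SNOMED rule.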
UMLS
In 1986, the National Library of Medicine (NLM) began a long-term research and development
project to build the Unified Medical Language System (UMLS®). The purpose of the UMLS is to aid the development of systems that help health professionals and researchers retrieve and
integrate electronic biomedical information from a variety of sources. The UMLS approach involves the development of machine-readable Knowledge Sources that can be used by a wide
variety of application programs to compensate for differences in the way concepts are expressed in different machine-readable sources and by different users, to identify the information
sources most relevant to a user inquiry, and to negotiate the telecommunications and search procedures necessary to retrieve information from these sources. The goal is to make it easy for
users to link disparate information systems, including computer-based patient records, bibliographic databases, factual databases, and expert systems.
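As a rough illustration of what "compensating for differences in the way concepts are expressed" can look like in software, the Python sketch below links different names from different sources to one shared concept identifier. Everything in it is an assumption made for illustration: the identifiers (`C-0001`), the source names (`VOCAB_A`), the row layout, and the helper functions are hypothetical and do not reproduce any actual UMLS file format or API.

```python
from collections import defaultdict

# Hypothetical rows: (concept_id, source_vocabulary, term).
ROWS = [
    ("C-0001", "VOCAB_A", "Myocardial infarction"),
    ("C-0001", "VOCAB_B", "Heart attack"),
    ("C-0001", "VOCAB_C", "Infarction, myocardial"),
    ("C-0002", "VOCAB_A", "Hypertension"),
    ("C-0002", "VOCAB_B", "High blood pressure"),
]


def normalize(term):
    """Lower-case a term and collapse runs of whitespace."""
    return " ".join(term.lower().split())


def build_index(rows):
    """Build term -> concept and concept -> [(source, term)] lookups."""
    term_to_concept = {}
    concept_to_terms = defaultdict(list)
    for concept_id, source, term in rows:
        term_to_concept[normalize(term)] = concept_id
        concept_to_terms[concept_id].append((source, term))
    return term_to_concept, concept_to_terms


def same_concept(a, b, term_to_concept):
    """True if both terms are known and map to the same concept."""
    ca = term_to_concept.get(normalize(a))
    cb = term_to_concept.get(normalize(b))
    return ca is not None and ca == cb


TERM_TO_CONCEPT, CONCEPT_TO_TERMS = build_index(ROWS)

if __name__ == "__main__":
    print(same_concept("heart attack", "Myocardial infarction", TERM_TO_CONCEPT))
```

With such an index, a query phrased in one source's wording can be matched against records coded in another, which is the kind of linking the paragraph above describes.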
There are four UMLS Knowledge Sources: the Metathesaurus®, the SPECIALIST™ Lexicon, a Semantic Network, and an Information Sources Map. Most heavily used to date, the
Metathesaurus provides a uniform, integrated distribution format for more than 30 biomedical vocabularies and classifications, linking many different names for the same concepts. The Lexicon
contains syntactic information for many Metathesaurus terms, component words, and English words, including verbs that do not appear in the Metathesaurus. The Semantic Network contains
information about the types or categories (e.g., "Disease or Syndrome," "Virus") to which all Metathesaurus concepts have been assigned and the permissible relationships among these types
(e.g., "Virus" causes "Disease or Syndrome"). The Information Sources Map, or directory, contains both human-readable and machine-processable information about the scope, location,
vocabulary, syntax rules, and access conditions of biomedical databases of all kinds.
The UMLS Knowledge Sources were designed as multi-purpose tools to facilitate the
development of more effective biomedical information systems. As intended, they have been applied in a wide variety of research and development environments to many different tasks,
including vocabulary development, knowledge representation, clinical data capture, linking patient data to knowledge sources, curriculum analysis, natural language processing, automated indexing,
and information retrieval. Particularly in its early years, but also more recently, the UMLS project commissioned exploratory and ancillary studies on such topics as user information needs, methods of organizing
and merging vocabulary information, and information retrieval techniques, and also developed specialized tools for use in the research effort.
CPT-4
Physicians' Current Procedural Terminology, 4th Edition (CPT-4) is a listing of descriptive terms and identifying codes for reporting medical services and procedures performed by physicians.
The purpose of the terminology is to provide a uniform language that accurately describes medical, surgical, and diagnostic services and thereby provides an effective means for reliable
nationwide communication among physicians, patients, and third parties. CPT first appeared in 1966. Each procedure or service is identified with a five-digit code. The main body of the material is
listed in six sections. Within each section are subsections with anatomic, procedural, condition, or descriptor subheadings. The procedures and services with their identifying codes are presented in
numeric order with one exception: the entire Evaluation and Management section (99201-99499) has been placed at the beginning of the listed procedures.
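The code structure just described (five digits, numeric order, with the Evaluation and Management range listed first) can be sketched in Python as follows. The function names are hypothetical, the check covers only the five-digit form stated in the text, and the sample codes in the usage line are arbitrary five-digit strings chosen to show the ordering, not a claim about what any code means.

```python
import re

# Evaluation and Management range quoted in the text.
EM_RANGE = (99201, 99499)

# Assumption: a code is exactly five ASCII digits, as the text states.
_FIVE_DIGITS = re.compile(r"[0-9]{5}")


def is_cpt_code(code):
    """True if the string has the five-digit shape described in the text."""
    return _FIVE_DIGITS.fullmatch(code) is not None


def is_evaluation_and_management(code):
    """True if a well-formed code falls in the 99201-99499 range."""
    if not is_cpt_code(code):
        raise ValueError(f"not a five-digit code: {code!r}")
    return EM_RANGE[0] <= int(code) <= EM_RANGE[1]


def listing_order(codes):
    """Sort codes as listed: E/M section first, then numeric order."""
    return sorted(
        codes,
        key=lambda c: (not is_evaluation_and_management(c), int(c)),
    )


if __name__ == "__main__":
    print(listing_order(["27447", "99213", "10060", "99201"]))
```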
A physician using CPT terminology and coding selects the name of the procedure or service that most accurately identifies the service performed. The physician may then list any additional
procedures performed or pertinent special services and, when necessary, any modifying or extenuating circumstances. Every service or procedure should be adequately documented in the
medical record. Any procedure or service in any section of the CPT book may be used to designate the services rendered by any qualified physician.
Specific "Guidelines" are presented at the beginning of each of the six sections. These Guidelines define items that are necessary to appropriately interpret and report the procedures and services
contained in that section. A star (*) following the procedure code number identifies certain surgical procedures to which the usual "package" concept for
surgical services cannot be applied. A modifier provides the means by which the reporting physician can indicate that a service or
procedure that has been performed has been altered by specific circumstances but not changed in its definition or code.
ICD-9-CM
ICD-9-CM stands for International Classification of Diseases, 9th Revision, Clinical Modification. The classification has been published under different names since 1900. ICD-9-CM is a statistical
classification system that arranges diseases and injuries into groups according to established criteria. Most ICD-9-CM codes are numeric and consist of three, four, or five digits and a
description. The codes are revised approximately every 10 years by the World Health Organization (WHO), and annual updates are published by the Health Care Financing Administration (HCFA). ICD-9-CM is based on the official
WHO version of the 9th Revision, International Classification of Diseases (ICD-9). ICD-9-CM was originally published as a three-volume set (2nd
edition). Newer versions of ICD-9-CM are available as two separate books (Volume 1 and Volume 2) and as a single book containing Volumes 1 and 2, or Volumes 1, 2, and 3, depending on the publisher.
The Tabular List (Volume 1) is a numeric listing of diagnosis codes and descriptions consisting of 17 chapters that classify diseases and injuries, two sections containing supplementary codes (V
codes and E codes), and six appendices. The Alphabetical Index (Volume 2) of ICD-9-CM consists of an alphabetic list of terms and
codes, two supplementary sections following the alphabetic listing, and three special tables found within the alphabetic listing.
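The diagnosis code shapes described above (numeric codes of three to five digits, plus the supplementary V codes and E codes) can be told apart with a small Python sketch. The exact patterns are assumptions based on common ICD-9-CM convention rather than on this article: a decimal point after the third digit of a numeric code, "V" followed by two digits, and "E" followed by three digits, each with an optional decimal extension. The function name `classify_icd9cm` is hypothetical.

```python
import re

# Assumed code shapes (common convention, not stated in the text):
#   numeric: 3 digits, optionally '.' and 1-2 more digits
#   V code:  'V' + 2 digits, optionally '.' and 1-2 more digits
#   E code:  'E' + 3 digits, optionally '.' and 1 more digit
PATTERNS = {
    "numeric": re.compile(r"[0-9]{3}(\.[0-9]{1,2})?"),
    "V code": re.compile(r"V[0-9]{2}(\.[0-9]{1,2})?"),
    "E code": re.compile(r"E[0-9]{3}(\.[0-9])?"),
}


def classify_icd9cm(code):
    """Return 'numeric', 'V code', 'E code', or None for an unrecognized shape."""
    code = code.strip().upper()
    for kind, pattern in PATTERNS.items():
        if pattern.fullmatch(code):
            return kind
    return None


if __name__ == "__main__":
    for sample in ["250.00", "486", "V70.0", "E880.9", "25"]:
        print(sample, "->", classify_icd9cm(sample))
```

A shape check of this kind only confirms that a string could be a code; whether the code actually exists has to be verified against the Tabular List.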
The Procedures: Tabular and Alphabetic Index (Volume 3) consists of two sections of codes that define procedures instead of diagnoses. Codes from Volume 3 are frequently used incorrectly by health care professionals;
they are intended only for use by hospitals. The ICD-9-CM Procedure Classification is a modification of WHO's Fascicle V, Surgical Procedures, and is published as
Volume 3 of ICD-9-CM. It contains both a Tabular List and an Alphabetic Index. Approximately 90% of the rubrics refer to surgical procedures, with the remaining 10%
accounting for other investigative and therapeutic procedures.
REQUIREMENTS FOR CLINICAL VOCABULARIES
Cimino et al. (1989) have defined six attributes as criteria for building and evaluating clinical vocabularies. Evans et al. (1991) have defined three additional features essential for concepts in clinical vocabularies:
- Domain completeness: coverage of all possible terms that lie within a vocabulary's domain.
- Unambiguity: the same term cannot refer to more than one concept.
- Nonredundancy: each concept must be represented by one unique identifier.
- Synonymy: multiple ways of expressing a word or concept must be allowed.
- Multiple classification: concepts must be allowed to be classified in multiple hierarchies.
- Consistency of views: concepts must have the same relationships in all views.
- Explicit relationships: all relationships must be explicitly labeled.
- Lexical decomposition: each concept must be lexically decomposable so that different attributes can be assigned.
- Semantic typology: each concept allows for restriction of allowable modifiers and grounds for synonyms.
- Extensible composition: certified terms can be allowed to generate new concepts.
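Three of the criteria above (unambiguity, nonredundancy, and synonymy) lend themselves to mechanical checking. The Python sketch below shows one way to do so on a toy vocabulary. The entries, identifiers, and function names are hypothetical and chosen only to trigger each check; they are not drawn from Cimino et al., Evans et al., or any real vocabulary.

```python
from collections import defaultdict

# Hypothetical toy vocabulary: (identifier, concept, term).
ENTRIES = [
    ("ID1", "myocardial infarction", "myocardial infarction"),
    ("ID1", "myocardial infarction", "heart attack"),  # synonym: allowed
    ("ID2", "common cold", "cold"),
    ("ID3", "low temperature", "cold"),                # same term, two concepts
    ("ID4", "common cold", "acute coryza"),            # same concept, second id
]


def ambiguous_terms(entries):
    """Unambiguity check: terms that refer to more than one concept."""
    concepts_by_term = defaultdict(set)
    for _identifier, concept, term in entries:
        concepts_by_term[term].add(concept)
    return {t: sorted(c) for t, c in concepts_by_term.items() if len(c) > 1}


def redundant_concepts(entries):
    """Nonredundancy check: concepts carrying more than one identifier."""
    ids_by_concept = defaultdict(set)
    for identifier, concept, _term in entries:
        ids_by_concept[concept].add(identifier)
    return {c: sorted(i) for c, i in ids_by_concept.items() if len(i) > 1}


def synonyms(entries, concept):
    """Synonymy: all terms recorded for one concept."""
    return sorted({term for _id, c, term in entries if c == concept})


if __name__ == "__main__":
    print(ambiguous_terms(ENTRIES))
    print(redundant_concepts(ENTRIES))
    print(synonyms(ENTRIES, "myocardial infarction"))
```

The remaining criteria, such as domain completeness or consistency of views, depend on knowledge outside the vocabulary file itself and are not covered by this sketch.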
CONCLUSION
None of the current nomenclature and classification systems meets all the criteria for clinical vocabularies
proposed by Cimino et al. and Evans et al. CPT-4 and ICD-9-CM are used mostly for financial purposes, while SNOMED and UMLS show greater promise for clinical applications.