How do you ensure the best validity of your data?

What is Data Collection?

Data collection is the process of gathering, measuring, and analyzing accurate information for research using standard, validated techniques.

Put simply, data collection is the process of gathering information for a specific purpose. It can be used to answer research questions, make informed business decisions, or improve products and services.

To collect data, we must first identify what information we need and how we will collect it. We can also evaluate a hypothesis based on collected data. In most cases, data collection is the primary and most important step for research. The approach to data collection is different for different fields of study, depending on the required information.

Validity is an evaluation criterion that indicates how well empirical evidence and theoretical foundations support an instrument, an examination, or an action taken. It is also understood as the degree to which an instrument measures what it purports to measure, or meets the objective for which it was constructed. A test cannot be considered valid without meeting this criterion. Validity, together with reliability, determines the quality of an instrument.

Validity has recently become more relevant in measurement because of the growing number of new instruments used at crucial moments, for example when selecting new personnel or when deciding whether to grant an academic degree. Likewise, some authors point out the need to validate the content of existing instruments.

The validation process is dynamic and continuous, and becomes more relevant the further it is explored. In 1954 the American Psychological Association (APA) identified 4 types of validity: content, predictive, concurrent and construct. Other authors, however, classify it into face (appearance), content, criterion and construct validity.

Content validity is defined as the logical judgment about the correspondence that exists between the trait or characteristic of the student’s learning and what is included in the test or exam. It aims to determine whether the proposed items or questions reflect the content domain (knowledge, skills or abilities) that you wish to measure.

To do this, evidence must be gathered about the quality and technical relevance of the test. It is essential that the test be representative of the content, drawing on a valid source such as the literature, the relevant population or expert opinion. This ensures that the test includes only what it must contain, and all of it; that is, it ensures the relevance of the instrument.


This type of validity can consider internal and external criteria. The internal criteria include the quality of the content, curricular importance, content coverage, cognitive complexity, linguistic adequacy, complementary skills and the value or weighting given to each item. The external criteria include equity, transfer and generalization, comparability and instructional sensitivity. These have an impact on both students and teachers.

The objective of this review is to describe the methodologies involved in the content validity process. The need arises from the decision to adopt a multiple-choice written exam, which measures knowledge and cognitive skills, as the means of obtaining the professional title of nurse or nurse midwife in a health school at a Chilean university. This process began in 2003 with the development of questions and their psychometric analysis; however, it is considered essential to determine the content validity of the instrument used.

To achieve this objective, a search was carried out in different databases of the electronic collection available in the University’s multi-search system, using the key words: content validity, validation by experts, think-aloud protocol/spoken thought. The inclusion criteria for selecting publications were: articles published from 2002 onwards, with full text, and without language restriction. Bibliography by classic authors on the subject was also incorporated. 58 articles were found, of which 40 were selected.

The information found was organized around the 2 most used methodologies to validate content: expert committee and cognitive interview.

Content validity type

Various methodologies can be used to determine the content validity of a test or instrument. Some authors propose that these include the results of the test, the opinion of the students, cognitive interviews and evaluation by experts. Others perform statistical analyses with various mathematical formulas, for example factor analysis with structural equations, although these are less common.

Cognitive interviews yield qualitative data that can be explored in depth, whereas expert evaluation seeks to determine the skill that the exam questions are intended to measure. Some experts point out that the following are essential to validate the content of an instrument: review of research, critical incidents, direct observation of the applied instrument, expert judgment and instructional objectives. The methods most frequently mentioned in the reviewed articles are the expert committee and the cognitive interview.

Expert Committee

It is a methodology that allows determining the validity of the instrument through a panel of expert judges for each of the curricular areas to be considered in the evaluation instrument, who must analyze – at a minimum – the coherence of the items with the objectives of the courses, the complexity of the items and the cognitive ability to be evaluated. Judges must have training in question classification techniques for content validity. This methodology is the most used to perform content validation.

Before carrying out this validation, it is therefore essential to resolve two problems: first, determine what can be measured; second, determine who the experts validating the instrument will be. For the first, it is essential that the author conduct an exhaustive bibliographic review of the topic; the author can also work with focus groups. Some authors define this period as a development stage.

For the second, although there is no consensus on the characteristics that define an expert, it is essential that he or she knows the area to be investigated, whether at an academic or professional level, and also knows complementary areas. Other authors are more emphatic when defining who is an expert and require, for example, at least 5 years of experience in the area. All of this means the sample must be purposive (intentional).

The characteristics of the experts must be defined and, at the same time, their number determined. Delgado and others point out that there should be at least 3, while García and Fernández, when applying statistical variables, concluded that the ideal number varies between 15 and 25 experts. Varela and others, however, point out that the number will depend on the objectives of the study, with a range between 7 and 30 experts.

Other authors are less strict when determining the number of experts and take into account various factors, such as geographical area or work activity, among others. They also point out that it is essential to anticipate the number of experts who will not be able to participate or who will drop out during the process.
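The advice to anticipate dropouts can be turned into a quick calculation. The sketch below is only an illustration: the function name `experts_to_invite` and the dropout percentage are assumptions, not figures from the reviewed literature.

```python
def experts_to_invite(target_panel: int, expected_dropout_pct: int) -> int:
    """Number of invitations needed so that about `target_panel` experts remain.

    `expected_dropout_pct` is a hypothetical whole-number percentage (0-99)
    chosen by the researcher; the source gives no standard value for it.
    """
    if target_panel <= 0:
        raise ValueError("target_panel must be positive")
    if not 0 <= expected_dropout_pct < 100:
        raise ValueError("expected_dropout_pct must be between 0 and 99")
    remaining_pct = 100 - expected_dropout_pct
    # Ceiling division with integers avoids floating-point rounding surprises.
    return -(-target_panel * 100 // remaining_pct)


# Example: a panel of 15 experts, assuming a quarter of invitees drop out.
invitations = experts_to_invite(15, 25)
```

With these assumed inputs, 20 invitations are needed to end with 15 experts.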

Once the criteria for selecting the experts have been decided, the experts are invited to participate in the project. During the same period, a classification matrix is prepared, with which each judge will determine the degree of validity of the questions.

The matrix is built on a Likert scale of 3, 4 or 5 points, whose response options can take different forms, for example: a) excellent, good, average and bad; b) essential; useful but not essential; not necessary. The choice depends on the type of matrix and the specific objectives pursued.

Other studies also mention having included spaces where the expert can add contributions and comments on each question. Each expert is then given the classification matrix and the instrument to be evaluated, either via email or in person in an office provided by the researcher.

Once the experts' results are obtained, the data are analyzed. The most common approach is to measure the agreement among the experts' evaluations of each item under review. Agreement is considered acceptable when it exceeds 80%; items that do not reach this percentage can be modified and submitted to a new validation process, or simply eliminated from the instrument.
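The 80% agreement rule can be sketched in a few lines of Python. The text does not say exactly how agreement is computed, so this example assumes it is the share of experts who chose the most common rating for an item; the function names and sample data are hypothetical.

```python
from typing import Dict, List


def percent_agreement(ratings: List[str]) -> float:
    """Share of experts who chose the most common rating for one item.

    Assumption: 'agreement' means the proportion in the modal category.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    most_common_count = max(ratings.count(r) for r in set(ratings))
    return most_common_count / len(ratings)


def classify_items(matrix: Dict[str, List[str]],
                   threshold: float = 0.80) -> Dict[str, str]:
    """Label an item 'keep' when agreement exceeds the threshold, else 'review'."""
    return {
        item: "keep" if percent_agreement(ratings) > threshold else "review"
        for item, ratings in matrix.items()
    }


# Hypothetical classification matrix: one list of expert ratings per item.
ratings_matrix = {
    "item_1": ["good", "good", "good", "good", "good"],     # 5 of 5 agree
    "item_2": ["good", "good", "good", "good", "average"],  # 4 of 5: exactly 80%
    "item_3": ["good", "bad", "average", "good", "bad"],    # 2 of 5 agree
}
decisions = classify_items(ratings_matrix)
```

Because the criterion is that agreement must exceed 80%, an item with exactly 80% agreement is flagged for review rather than kept.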

Other authors report using Lawshe’s (1975) statistical test to determine the degree of agreement between the judges; it yields a content validity ratio with values between -1 and +1. A positive value indicates that more than half of the judges agree; a negative value means that fewer than half of the experts do. Once the values are obtained, the questions or items are modified or eliminated.

To determine content validity using experts, the following phases are proposed: a) define the universe of admissible observations; b) determine who the experts in that universe are; c) have the experts present their judgment on the validity of the content through a concrete and structured procedure; and d) prepare a document that summarizes the data previously collected.
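As an illustration, Lawshe's content validity ratio is commonly written as CVR = (n_e - N/2) / (N/2), where n_e is the number of judges rating the item as essential and N is the total number of judges. A minimal Python sketch, with a hypothetical function name, might look like this:

```python
def content_validity_ratio(n_essential: int, n_experts: int) -> float:
    """Lawshe's CVR = (n_e - N/2) / (N/2); the result ranges from -1 to +1."""
    if n_experts <= 0:
        raise ValueError("n_experts must be positive")
    if not 0 <= n_essential <= n_experts:
        raise ValueError("n_essential must be between 0 and n_experts")
    half = n_experts / 2
    return (n_essential - half) / half


# Example with 10 judges, 8 of whom rate the item as essential.
cvr = content_validity_ratio(8, 10)
```

The value is 0 when exactly half of the judges rate the item as essential, positive above that point and negative below it, which matches the interpretation given above.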

The literature describes other methodologies that can be used together or individually. Among them are:

– Fehring Model: aims to explore, through the opinion of a group of experts, whether the instrument measures the concept it is intended to measure. It is used in nursing, by the North American Nursing Diagnosis Association (NANDA), to analyze the validity of interventions and results. The method consists of the following phases:

a) Experts are selected, who determine the pertinence and relevance of the topic and of the areas to be evaluated using a Likert scale.

b) The scores assigned by the judges, and the proportion of them falling in each category of the scale, are determined, which yields the content validity index (CVI). The index for each item is obtained by adding the values given by the experts for that item and dividing by the total number of experts. These item indices are then averaged, and those whose average does not exceed 0.8 are discarded.

c) The final format of the text is edited taking the CVI value into account: using the aforementioned parameter, it is determined which items will make up the final instrument and which, because of their low CVI value, are considered critical and must be reviewed.
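Phases b) and c) can be sketched as follows. The text does not say how Likert responses are converted to numbers, so this example assumes a 5-point scale mapped to weights from 0 to 1 in steps of 0.25; the weights, function names and sample ratings are assumptions for illustration only.

```python
from typing import Dict, List, Tuple

# Assumed mapping from a 5-point Likert response to a weight between 0 and 1.
LIKERT_WEIGHTS = {1: 0.0, 2: 0.25, 3: 0.5, 4: 0.75, 5: 1.0}


def item_cvi(ratings: List[int]) -> float:
    """Sum the experts' weighted ratings for one item and divide by the number of experts."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return sum(LIKERT_WEIGHTS[r] for r in ratings) / len(ratings)


def split_items(matrix: Dict[str, List[int]],
                cutoff: float = 0.8) -> Tuple[List[str], List[str]]:
    """Return (retained, critical): items whose CVI exceeds the cutoff, and the rest."""
    retained, critical = [], []
    for item, ratings in matrix.items():
        if item_cvi(ratings) > cutoff:
            retained.append(item)
        else:
            critical.append(item)
    return retained, critical


# Hypothetical ratings from four experts on two items.
expert_ratings = {
    "item_a": [5, 5, 5, 4],  # (1 + 1 + 1 + 0.75) / 4 = 0.9375
    "item_b": [3, 4, 2, 5],  # (0.5 + 0.75 + 0.25 + 1) / 4 = 0.625
}
retained, critical = split_items(expert_ratings)
```

Here `item_a` is retained and `item_b` is flagged as critical, since its index does not exceed 0.8.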

A specific example of this model is Fehring's adaptation for establishing the content validity of nursing diagnoses. In this case, the author proposes 7 characteristics that an expert must meet, each associated with a score according to its importance. A candidate is expected to obtain at least 5 points to be selected as an expert.

The maximum score corresponds to the degree of Doctor of Nursing (4 points), and one of the criteria with the minimum score (1 point) is having one year of clinical practice in the area of interest. It is important to clarify that the authors recognize the difficulty that exists in some countries because of the lack of expertise among professionals.

– Q Methodology: introduced by Thompson and Stephenson in 1935 to identify, in a qualitative-quantitative way, common patterns of opinion among experts regarding a situation or topic. The methodology is carried out through the Q ordering system, which is divided into stages. The first brings together the experts, in the number advised by Waltz (between 25 and 70), who select and order the questions according to their points of view on the topic under study; bibliographic evidence is also provided as support.

The second phase consists of collecting this information from each of the experts according to relevance, along a continuum from “strongly agree” to “strongly disagree”. Finally, statistical analyses are carried out to determine the similarity of all the information and the dimensions of the phenomenon.

– Delphi Method: allows obtaining the opinion of a panel of experts. It is used when there is little empirical evidence, the data are diffuse or subjective factors predominate. It allows experts to express themselves freely, since opinions are confidential, and at the same time it avoids problems such as poor representation and the dominance of some people over others.

Two groups participate in the process: the monitoring group, which prepares the questions and designs the exercises, and the group of experts, which analyzes them. The monitoring group plays a fundamental role, since it must manage the objectives of the study and also meet a series of requirements, such as knowing the Delphi methodology thoroughly, being an academic researcher on the topic under study and having interpersonal skills.

The rounds take place in complete anonymity: the experts give their opinions, debate the opinions of their peers, make comments and reanalyze their own ideas in light of feedback from the other participants. Finally, the monitoring group generates a report that summarizes the analysis of each of the responses and strategies provided by the experts. It is essential to limit the number of rounds, because of the risk that experts will abandon the process.

The Delphi method is the most widely used of these because of its high degree of reliability, flexibility, dynamism and validity (content and others). Its notable attributes are the anonymity of the participants, the heterogeneity of the experts, and the prolonged interaction and feedback between participants; this last attribute is an advantage not present in the other methods. There is also evidence that it increases confidence in the decision made, since responsibility for that decision is shared by all participants.

