gms | German Medical Science

GMS Medizinische Informatik, Biometrie und Epidemiologie

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDS)

ISSN 1860-9171

Validation of the TeleForm scan workflow in the GNC health study on the example of the questionnaire on physical activity

Validierung des TeleForm Scan-Workflows in der NAKO Gesundheitsstudie am Beispiel des Fragebogens zur körperlichen Aktivität

Research Article

  • corresponding author Katja Uekoetter - Hamm-Lippstadt University of Applied Sciences, Hamm, Germany
  • Nina Ebert - German Diabetes Centre, Heinrich Heine University, Dusseldorf, Germany
  • Alexandra Stoffels - IUF – Leibniz Research Institute for Environmental Medicine, Dusseldorf, Germany
  • Claudia Wigmann - IUF – Leibniz Research Institute for Environmental Medicine, Dusseldorf, Germany
  • Tamara Schikowski - IUF – Leibniz Research Institute for Environmental Medicine, Dusseldorf, Germany

GMS Med Inform Biom Epidemiol 2021;17(1):Doc03

doi: 10.3205/mibe000217, urn:nbn:de:0183-mibe0002173

Published: May 31, 2021

© 2021 Uekoetter et al.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Abstract

Electronic data capture (EDC) is an important tool for the digitalisation of paper-based documents such as questionnaires and for the identification of errors before values are finally saved in a database. The data acquisition software TeleForm is one example of an EDC system used to digitise paper-based documents. TeleForm checks the data of the scanned document and flags data that may have been read incorrectly. In the German National Cohort (GNC), this software is used, among other things, to digitise questionnaires.

The following questions are addressed in this article: Is the scan workflow for the questionnaires in the GNC, and in particular the data acquisition software TeleForm (with the settings chosen for the GNC), reliable? How much loss of data quality is acceptable in order to reduce the amount of work? Can artificial intelligence sufficiently replace human inspection, or will the latter continue to play an indispensable role in the scan workflow of the GNC in the future? By answering these questions, the strengths and limitations of the scan workflow in the GNC using the TeleForm software are discussed.

The current work uses data collected in the GNC centre in Dusseldorf. A total of 300 questionnaires on physical activity were validated and checked twice: first by the TeleForm system and then by visual assessment.

The data acquisition software TeleForm shows high error rates in interpreting free text fields as well as in reading handwritten numbers. The digit “0” was misinterpreted most often.

In order to save time and thus make work easier, some shortcomings must be remedied. This can be achieved, for example, by putting special emphasis on the expansion of the reading areas of TeleForm and on the improved reproduction and reading of numerical values.

Keywords: GNC, TeleForm, validation, questionnaire, scan workflow

Zusammenfassung

Die elektronische Datenerfassung (EDC) ist ein wichtiges Instrument zur Digitalisierung von papierbasierten Dokumenten wie beispielsweise Fragebögen. Ebenso ist es für die Identifizierung von Fehlern hilfreich, bevor die Werte endgültig in einer Datenbank gespeichert werden. Die Datenerfassungssoftware TeleForm ist ein Beispiel für ein EDC-System, das zur Digitalisierung von papierbasierten Dokumenten eingesetzt wird. TeleForm prüft die Daten des eingescannten Dokumentes und gibt Hinweise auf möglicherweise fehlerhaft gelesene Daten. In der NAKO Gesundheitsstudie wird diese Software unter anderem zur Digitalisierung von Fragebögen eingesetzt.

In diesem Artikel werden die folgenden Fragen behandelt: Ist der Scan-Workflow bezogen auf die Fragebögen in der NAKO Gesundheitsstudie und insbesondere die Datenerfassungssoftware TeleForm (mit den für die NAKO gewählten Einstellungen) zuverlässig? Wieviel Verlust an Datenqualität ist akzeptabel, um den Arbeitsaufwand zu reduzieren? Kann künstliche Intelligenz die menschliche Überprüfung ausreichend ersetzen oder wird letztere auch in Zukunft eine unverzichtbare Rolle im Scan-Workflow der NAKO spielen? Durch die Beantwortung dieser Fragen sollen die Stärken und Grenzen des Scan-Workflows in der NAKO Gesundheitsstudie unter Verwendung der TeleForm-Software diskutiert werden.

Die aktuelle Arbeit verwendet Daten, die im NAKO-Zentrum in Düsseldorf erhoben wurden. 300 Fragebögen zur körperlichen Aktivität wurden validiert und zweimal überprüft, zum einen durch das System TeleForm und zum anderen durch eine visuelle Kontrolle.

Die Datenerfassungssoftware TeleForm zeigt hohe Fehlerquoten bei der Interpretation von Freitextfeldern sowie beim Lesen von handgeschriebenen Zahlen. Insbesondere die Ziffer „0“ wurde am häufigsten falsch interpretiert.

Um Zeit zu sparen und damit die Arbeit zu erleichtern, müssen einige Defizite behoben werden. Dies kann zum Beispiel durch die Erweiterung der Lesebereiche sowie die Verbesserung des Lesens von Zahlenwerten erzielt werden.

Schlüsselwörter: NAKO Gesundheitsstudie, TeleForm, Validierung, Fragebogen, Scan-Workflow


1 Introduction

The availability of digital data is becoming increasingly important. In the course of increasing digitalisation, it is essential to correctly record and reproduce data that was previously stored on paper [1]. Electronic data capture (EDC) is beneficial here, for example, to ensure data accessibility across multiple centres of an epidemiological study as well as across health settings and health regions [2] by capturing manually completed documents. In the nationwide German National Cohort (GNC) health study, the questionnaires are submitted to a workflow via a data acquisition software for digital data collection called TeleForm [3]. Paper documents are scanned with a document scanner, so that the raw data are stored digitally instead of on paper. TeleForm thus makes it possible to create and process paper-based and electronic documents. The software is intended to facilitate work and save time [2], [3], [4]. Moreover, emphasis is put on the extraction of raw data for statistical evaluation and the transfer of data to downstream systems. The system extracts values or terms from fixed reading ranges. These can be checkboxes, barcodes, matrices as well as handwriting and typewriting [5]. The advantages of the TeleForm software are apparent: fast and easy data acquisition and thus a facilitation of work. No real disadvantages were identified in advance. The aim of this article is to test the validity of the GNC scan workflow, in particular with respect to the TeleForm software.


2 Objectives

The objective is to assess the reliability of the system. This means determining whether the reproductions are trustworthy and whether the highlighting set by TeleForm, which indicates a mistake or an uncertainty about a read answer, is sufficient for the digitalisation of the questionnaire. We want to examine whether an additional visual inspection by a human, which will be called “Eye Check” throughout this article, remains indispensable given the error susceptibility of the acquisition software determined within the scope of this work. Manual data transfer is not considered. The presented study therefore aims to answer the following questions:

Does the system facilitate work, or does it increase the amount of work involved because errors still occur despite the highlighting and can only be detected by an exact check? Is it possible to derive from the collected data potential improvements to the settings used with the TeleForm system that would ultimately make the “Eye Check” redundant, or will human inspection continue to play an essential role in the scan workflow of the GNC health study, and possibly in other contexts, in the future?


3 Methods

3.1 The GNC health study and how TeleForm is used

The health development of the German population as well as the different environmental factors and living conditions are increasingly becoming the focal point of current research. A consortium of universities (including Heidelberg, Greifswald, Regensburg, Freiburg and Munster), research institutes and scientists developed the idea of a German national long-term study with 200,000 participants, the German National Cohort [6].

In the course of their visit to the study centre, participants of the GNC fill out several questionnaires, one of which asks about physical activity during the last twelve months. Some questionnaires are already collected electronically, but others are still collected on paper in order to process a larger throughput of participants on site. This way, the existing digital equipment can be used while the remaining participants fill out the paper questionnaires.

After the participants have completed the questionnaires at the study centre, the questionnaires are processed further at the IUF – Leibniz Research Institute for Environmental Medicine in Dusseldorf. The data acquisition software TeleForm Web Capture runs in a local web browser, via which the pseudonymised scans are sent to the data integration centre (cp. Figure 1 [Fig. 1]). There the questionnaires are evaluated with regard to the subsequent correction in the examination centre. The evaluation status of the respective questionnaire can be read from the Citrix terminal server display interface, where the actual verification also takes place. The complete validity check is carried out in the module TeleForm Verifier [7], which verifies the collected data by performing control checks and highlighting any possibly incorrect information [5], [8]. Because the study centres are spread throughout Germany, smooth communication between them is a key success factor. TeleForm connects all recorded information with all centres using the Enterprise Content Management System (ECMS) [3]. The ECMS makes it possible to manage the large amount of information and data by presenting them online in TeleForm Web Capture in a structured way, accessible to all staff members involved (cp. Figure 1 [Fig. 1]). The GNC uses TeleForm to scan questionnaire responses and uses the integrated highlighting function to flag an uncertainty or an error in the identified response.

3.2 Validation of the scan workflow and statistical analysis

For this article, 300 questionnaires on physical activity were validated. The number of questionnaires was chosen by weighing statistical precision against practicability. The collected data represent only a partial sample confined to the study centre in Dusseldorf. The subsample was selected randomly and included questionnaires that had been completed over a period of four months. After the first check by the TeleForm software, the “Eye Check” of every single question and its corresponding answer was carried out by one person and took between one and five minutes, depending on how many false positive and false negative answers occurred. Note that the settings of the TeleForm software were adapted to the physical activity questionnaire and the corresponding needs of the GNC. These settings include the reading frame, the context check, whether numeric or letter values are expected, and whether a question is multiple or single choice. Consequently, all results and conclusions presented here apply only to these settings and to the questionnaires used in the GNC, and not necessarily to questionnaires of other studies that also process data with the TeleForm software.

Below, the frequencies of errors as well as the most common error types within TeleForm are assessed. To present the error frequency, all 19,800 observations (300 questionnaires × 66 questions each) are displayed in a two-by-two matrix (cp. Figure 2 [Fig. 2]). The number of possible answers was only taken into account for boxes to be filled in; checkbox questions were each counted as one answer. On the basis of these values, the following statistical quality criteria were calculated: sensitivity, specificity, positive predictive value and negative predictive value. Finally, a 95% confidence interval was calculated for each quality criterion.
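The four quality criteria are simple ratios of the cells of the two-by-two matrix. The following Python sketch illustrates the calculation. The article does not state which confidence interval method was used, so the normal-approximation (Wald) interval here is an assumption, and the function names and example counts are purely illustrative and are not the study data.

```python
import math


def proportion_with_ci(successes, total, z=1.96):
    """Return (estimate, lower, upper) for a proportion.

    Uses the normal-approximation (Wald) interval; this choice is an
    assumption, the article does not name its interval method.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def quality_criteria(tp, fp, fn, tn):
    """Compute the four criteria from the cells of a two-by-two matrix.

    "Positive" means the answer was highlighted; "true" means the
    highlighting decision agreed with the visual check.
    """
    return {
        "sensitivity": proportion_with_ci(tp, tp + fn),
        "specificity": proportion_with_ci(tn, tn + fp),
        "ppv": proportion_with_ci(tp, tp + fp),
        "npv": proportion_with_ci(tn, tn + fn),
    }


# Illustrative counts only (NOT the study data)
example = quality_criteria(tp=80, fp=20, fn=20, tn=880)
for name, (estimate, lower, upper) in example.items():
    print(f"{name}: {estimate:.2f} (95% CI: {lower:.2f}; {upper:.2f})")
```

Other interval methods, such as Wilson or exact intervals, give slightly different bounds, especially for proportions close to 0 or 1.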


4 Results

4.1 Frequency of errors

Figure 3 [Fig. 3] shows the absolute values of the four possible cases in all 19,800 observations. The relative share of true negative observations is 93.68%, while the proportion of false negative results is 0.95%. In addition, 2.27% of the observations were false positive and 3.11% were true positive. The calculated quality criteria amount to 0.69 (95% CI: 0.66; 0.72) for the sensitivity, 0.98 (95% CI: 0.97; 0.98) for the specificity, 0.52 (95% CI: 0.49; 0.55) for the positive predictive value and 0.99 (95% CI: 0.98; 0.99) for the negative predictive value.

The sensitivity and specificity are used to assess the reliability of the TeleForm system and describe its characteristics as determined by the validation. In such a diagnostic test, emphasis can be placed either on high sensitivity or on high specificity. With a high sensitivity, the true positive answers, i.e. those marked and actually incorrect, would clearly outnumber the false negative answers, i.e. those unmarked but still incorrect. The system would then flag most actual errors, and the proportion of undetected errors would be minimal. With a value of 0.69, the sensitivity of TeleForm is only moderate, whereas its specificity is considerably higher at 0.98. A high specificity means that the true negative answers clearly outnumber the false positive answers. The importance of the specificity should be pointed out here, as most values are in fact interpreted correctly. A low specificity would considerably increase the effort data managers have to spend on corrections.

The positive predictive value (PPV) only considers the cases in which TeleForm sets a mark. A high PPV would be achieved if the number of true positive answers were significantly higher than the number of false positive answers. This is not the case: the shares of the two cases differ by less than one percentage point, and the PPV is only 58%. TeleForm thus marks a response with a chance of roughly 50:50 that the response is actually wrong. To put it bluntly, the markings set by TeleForm are close to arbitrary. Minimising false positives would increase the PPV. This could be achieved through the suggested improvements listed in the discussion. The negative predictive value (NPV) considers the cases not marked by TeleForm. The ratio between false negative and true negative answers is about 1:98, resulting in an NPV of about 99%. This value illustrates the ability of TeleForm to recognise correct answers.

The two desirable cases, true negative and true positive, occurred in 19,163 observations. This corresponds to 96.79% of all observations, which is a very good rate and speaks in favour of TeleForm. The two undesirable cases, false negative and false positive, were observed in a total of 637 answered questions, the former being much more serious than the latter. In total, 18,997 answers were correct and 803 had to be corrected. TeleForm highlighted 1,064 answers, whereas 18,736 answers were not highlighted. The 1,064 highlighted answers show how well TeleForm recognises its own mistakes. Without human inspection, 188 incorrect answers would have been captured.
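The totals quoted in this section are sufficient to reconstruct all four cells of the two-by-two matrix. The following Python sketch does this, checks the cells against the other totals given in the text, and recomputes the relative shares and the ratio-based criteria. It is a recomputation from the quoted totals only and does not reproduce the authors' analysis, so its ratios need not coincide with the reported quality criteria. All variable names are illustrative.

```python
# Totals quoted in the text
total_observations = 300 * 66   # 19,800 observations
highlighted = 1064              # answers marked by TeleForm
incorrect = 803                 # answers that had to be corrected
false_negative = 188            # incorrect answers that were not marked

# Derive the remaining cells of the two-by-two matrix
true_positive = incorrect - false_negative
false_positive = highlighted - true_positive
true_negative = total_observations - highlighted - false_negative

cells = {
    "true_negative": true_negative,
    "false_negative": false_negative,
    "false_positive": false_positive,
    "true_positive": true_positive,
}

# Cross-check against the other totals quoted in the text
assert true_negative + true_positive == 19163    # desirable cases
assert false_negative + false_positive == 637    # undesirable cases
assert true_negative + false_positive == 18997   # correct answers
assert true_negative + false_negative == 18736   # not highlighted

# Relative shares in percent, rounded to two decimals
shares = {
    name: round(100 * count / total_observations, 2)
    for name, count in cells.items()
}

# Ratio-based criteria recomputed from the reconstructed cells
recomputed = {
    "sensitivity": true_positive / (true_positive + false_negative),
    "specificity": true_negative / (true_negative + false_positive),
    "ppv": true_positive / (true_positive + false_positive),
    "npv": true_negative / (true_negative + false_negative),
}

for name, count in cells.items():
    print(f"{name}: {count} ({shares[name]}%)")
for name, value in recomputed.items():
    print(f"{name}: {value:.2f}")
```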

4.2 Error types

Note that the German text visible in the screenshots of the questionnaires is not relevant for understanding the topic; only the answer fields matter.

Answers highlighted by TeleForm that in fact contained an error (“true positive”) could be traced back to a non-readable or crossed-out value in 67.64% of cases (cp. Figure 4C [Fig. 4]). For numerical values, a comparison with the available digits (0–9) is carried out. If the probability of the numeric value falls below a certainty threshold, the system sets “~” [8] (cp. Figure 4B,C [Fig. 4]). This error occurred 416 times.
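The certainty threshold described here can be pictured with a small Python sketch. It is not TeleForm's actual recognition logic: the function name, the score format and the threshold value of 0.8 are assumptions chosen for illustration only.

```python
def read_digit(scores, certainty_threshold=0.8):
    """Return the most likely digit, or "~" if the reading is too uncertain.

    scores: dict mapping the characters "0".."9" to a recognition score
    between 0 and 1 (hypothetical format).
    certainty_threshold: hypothetical value, not a TeleForm default.
    """
    best_digit = max(scores, key=scores.get)
    if scores[best_digit] < certainty_threshold:
        return "~"  # placeholder for a value that needs a visual check
    return best_digit


# A clearly written 7
clear = {str(d): 0.01 for d in range(10)}
clear["7"] = 0.91

# An ambiguous shape that could be a 0, an 8 or a 6
ambiguous = {str(d): 0.0 for d in range(10)}
ambiguous.update({"0": 0.45, "8": 0.40, "6": 0.15})

print(read_digit(clear))
print(read_digit(ambiguous))
```

In this sketch the clear reading is accepted, while the ambiguous one is replaced by the placeholder and would have to be resolved by the visual check.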

A further error type concerns the capture of handwritten text fields (cp. Figure 5 [Fig. 5]).

False positive results were mainly caused by blurred numbers. The TeleForm system highlighted 407 items which it could not clearly decipher. The system performs a plausibility check by comparing the handwritten digit with all possible values (0–9). If the plausibility rule is violated, the value is highlighted in yellow for checking [9] (cp. Figure 6C [Fig. 6]). This accounts for 90.60% of the false positive results. The questions with the highest error rates are those where a number had to be filled in and not, as might be expected, those with handwritten words (cp. Table 1 [Tab. 1]).

Answers that were not highlighted in yellow by TeleForm but were incorrect when checked visually (“false negative”) occurred 188 times and most frequently shared the same error: although one or two ticked items had been crossed out, the system recorded all of them as ticked, even though the question allowed only one answer (Figure 7 [Fig. 7]). In total, this observation was made 67 times, which represents a share of 35.6%. When drawing information from checkboxes, TeleForm measures the degree to which the circle is filled. If it is only minimally filled (<25%), TeleForm does not tick the answer, but if the circle is filled to 40% or more, the answer is ticked. These values are the system’s default setting and were adopted unmodified by the GNC, but they can be adjusted [8]. With regard to sensitivity and specificity, it is noticeable that the questions with a sensitivity of zero and a specificity of one are all checkbox questions.
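The fill-level rule for checkboxes can be sketched in Python as follows. The sketch reads “minimally filled” as a fill level below 25% and uses the 40% limit from the text. How TeleForm treats fill levels between the two limits is not stated in this article, so the "uncertain" outcome is an assumption, as are the function and parameter names.

```python
def classify_checkbox(fill_fraction, lower=0.25, upper=0.40):
    """Classify a checkbox by the fraction of its area that is filled.

    lower/upper follow the default limits quoted in the text (25% and 40%).
    The "uncertain" result for the range in between is an assumption.
    """
    if not 0.0 <= fill_fraction <= 1.0:
        raise ValueError("fill_fraction must be between 0 and 1")
    if fill_fraction >= upper:
        return "ticked"
    if fill_fraction < lower:
        return "not ticked"
    return "uncertain"


print(classify_checkbox(0.10))  # stray mark
print(classify_checkbox(0.30))  # between the two limits
print(classify_checkbox(0.55))  # clearly marked box
print(classify_checkbox(0.90))  # box that was ticked and then crossed out
```

A box that was ticked and then crossed out is filled well above the upper limit, so a rule based on the fill level alone records it as ticked. This matches the most frequent false negative pattern described above.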

Another error is that the system reads the digit “1” or “7” when the participant in fact wanted to cross out the answer with a slash (Figure 8 [Fig. 8]).

As seen in Figure 9 [Fig. 9], some numbers were misread because the reading frame was too small. In the present example, TeleForm reads a 7 instead of a 2.

The digits most frequently misinterpreted by the TeleForm software are given in Table 2 [Tab. 2]. The digit 0 was read incorrectly most often, accounting for 35.1% of all misread numeric values, followed by the digits 5 and 1. Instead of 0, an 8 or a 6 was frequently read, while a 3 was regularly read instead of 5. The digit 1 was commonly mistaken for a 7. It should be noted that the error frequency must be seen in relation to the number of opportunities to make an error. Although the digit 0 was most frequently misinterpreted, it could not be determined whether 0 was also the most frequently written digit in this study. This should be considered when interpreting the results. The most common numeric errors are listed in Table 3 [Tab. 3]. All numeric errors that occurred can be found in Attachment 1 [Attach. 1].
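The caveat about opportunities to make an error can be illustrated with a short Python sketch. The counts are entirely hypothetical, since the article does not report how often each digit was written; the point is only that the digit with the most misreadings need not have the highest misreading rate.

```python
def misread_rate_per_digit(misread, written):
    """Relate the number of misreadings of each digit to how often it was written."""
    rates = {}
    for digit, n_written in written.items():
        if n_written == 0:
            continue  # no opportunity to misread this digit
        rates[digit] = misread.get(digit, 0) / n_written
    return rates


# Hypothetical counts (NOT taken from the study)
written = {"0": 400, "5": 100, "1": 200}
misread = {"0": 40, "5": 20, "1": 10}

rates = misread_rate_per_digit(misread, written)
most_misread = max(misread, key=misread.get)
highest_rate = max(rates, key=rates.get)

print("Most misreadings in absolute terms:", most_misread)
print("Highest misreading rate:", highest_rate)
```

With these made-up counts, the digit 0 has the most misreadings in absolute terms, but the digit 5 is misread at the highest rate.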

As indicated by the results, there are different error types. These are presented in Table 1 [Tab. 1], indicating their frequency within the 300 validated questionnaires.

Errors regarding checkboxes occurred 30 times which accounts for a share of 14.0% and handwritten text was misinterpreted 36 times which amounts to 16.8%. Numeric values were misinterpreted with an absolute frequency of 148 which leads to a relative frequency of 69.2%.
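The relative frequencies follow directly from the three absolute frequencies, as the following short Python sketch shows. The dictionary keys are illustrative labels.

```python
# Absolute frequencies of the error types as given in the text
error_counts = {
    "checkbox": 30,
    "handwritten text": 36,
    "numeric value": 148,
}

total_errors = sum(error_counts.values())

# Relative frequencies in percent, rounded to one decimal place
error_shares = {
    error_type: round(100 * count / total_errors, 1)
    for error_type, count in error_counts.items()
}

for error_type, share in error_shares.items():
    print(f"{error_type}: {error_counts[error_type]} of {total_errors} ({share}%)")
```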


5 Discussion

As mentioned above, at the GNC study centre in Dusseldorf information is collected both digitally and by paper questionnaires. This is partly because as many participants as possible should be interviewed on site and partly because the participants receive questionnaires by post with the invitation to participate in the study. The completed forms are then brought to the site. The validation of the questionnaires is carried out by one person who is responsible for the quality check. The Eye Check is not double-checked by a second person.

Based on the calculated absolute and relative error frequencies, it can be concluded that TeleForm is a useful software for capturing data from paper questionnaires, especially for the requirements of the GNC. This is emphasised by the results of the collected data. In addition, the aspects of work facilitation and time saving could be confirmed. Jorgensen and Karlsmose also acknowledge the latter aspect [2], [3], [4].

The TeleForm software has difficulties in interpreting free text fields, e.g. when participants wrote down the sports they performed by hand (cp. Figure 5 [Fig. 5]). This became apparent through frequently misread answers. The same error type occurred with the highest rate in the study of Jenkins and colleagues [10]. Quan et al. likewise identified text fields as the largest error source and described these answers as the most labour-intensive for the data managers and in fact inefficient [2]. This kind of error may be reduced by advising participants to fill out the text fields in capital letters only. In general, there was a problem with numbers that had to be filled in by hand. This applies to both the false negatives and the true positives. These constrained print fields are also handwritten and, as stated before, illegible handwriting causes the most problems. In this context, Jorgensen and Karlsmose found that numeric recognition generates a significantly higher error rate than manual data entry [4]. To address this, TeleForm offers both a context verification and a reliability level. During context verification, TeleForm assumes that a response in an alphanumeric field consists either only of numbers or only of letters. The reliability level, which is adjustable by the data manager, defines how trustworthy a recognised answer must be. If the certainty is too low, the answer is highlighted. In addition, scanning can be improved by using a higher resolution [8]. As a result, the undesirable “false negative” cases can be reduced by choosing the threshold accordingly, at the expense of additional “false positive” cases. It should be noted again that the settings of the TeleForm software are adapted to the requirements of the GNC.
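The trade-off between false negative and false positive cases can be made concrete with a small Python sketch. It assumes that an answer is highlighted whenever its recognition confidence lies below an adjustable threshold; the confidence values, the function name and both thresholds are hypothetical and do not describe TeleForm's internal scoring.

```python
def count_outcomes(readings, threshold):
    """Count the four cases for a given highlighting threshold.

    readings: list of (confidence, is_correct) pairs (hypothetical data).
    An answer is highlighted if its confidence is below the threshold.
    """
    outcomes = {
        "true_positive": 0,   # highlighted and actually wrong
        "false_positive": 0,  # highlighted but actually correct
        "false_negative": 0,  # not highlighted but actually wrong
        "true_negative": 0,   # not highlighted and correct
    }
    for confidence, is_correct in readings:
        is_highlighted = confidence < threshold
        if is_highlighted and not is_correct:
            outcomes["true_positive"] += 1
        elif is_highlighted and is_correct:
            outcomes["false_positive"] += 1
        elif not is_correct:
            outcomes["false_negative"] += 1
        else:
            outcomes["true_negative"] += 1
    return outcomes


# Hypothetical recognition results: (confidence, read correctly?)
readings = [
    (0.95, True), (0.90, True), (0.85, False),
    (0.70, True), (0.60, False), (0.40, False),
]

lenient = count_outcomes(readings, threshold=0.5)
strict = count_outcomes(readings, threshold=0.9)
print("lenient:", lenient)
print("strict: ", strict)
```

With these made-up readings, the stricter setting removes the false negatives but introduces a false positive, i.e. additional work for the visual check.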

Furthermore, there were difficulties with regard to the reading area of TeleForm. TeleForm defines standard detection fields for machine-written and handwritten entries. Hence, the numeric value or the text has to lie within the reading range; otherwise TeleForm cannot read it or records a wrong entry (cp. Figure 9 [Fig. 9]). To avoid this error source, the reading areas can be enlarged in the TeleForm settings [9].

As pointed out in the results, false negative cases make up a small minority. They were mostly caused by checkboxes in multiple choice items (cp. Figure 7 [Fig. 7]). To prevent wrong answers from being captured in the database, TeleForm could highlight this type of question by default so that the “Eye Check” has to be done every time. However, this would increase the workload. It therefore has to be decided which of the alternatives is preferable. Human checking is not perfect either, so errors cannot be ruled out completely.
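A narrower variant of this idea is to highlight a question only when more boxes are recorded as ticked than the question allows. The following Python sketch illustrates such a rule. It is not a TeleForm feature described in the article, and the function name, question identifiers and data layout are hypothetical.

```python
def needs_eye_check(ticked_options, max_answers=1):
    """Flag a question if more options are recorded as ticked than allowed."""
    return len(ticked_options) > max_answers


# Hypothetical recognised answers for three single-answer questions
recognised = {
    "question_12": ["yes"],
    "question_13": ["never", "sometimes"],  # one box was probably crossed out
    "question_14": [],
}

flagged = [
    question
    for question, ticked in recognised.items()
    if needs_eye_check(ticked)
]
print(flagged)
```

Such a rule would catch crossed-out boxes that are still recorded as ticked without sending every checkbox question to the visual check. It cannot detect a question in which only the wrong single box was recorded.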

True negative cases are identified most of the time. TeleForm has thus proven that it is able to recognise correct answers directly. This is a strength, as is the highlighting of uncertainties concerning specific answers. Nevertheless, there are also limitations, for example the size of the reading frames and the handling of checkbox questions. The system could be improved by enlarging the reading areas and by a higher sensitivity to crossed-out answers. This would simplify work and save time during the digital acquisition of questionnaires as practised within the GNC health study. A false positive case merely prompts a visual re-check of the highlighted answer. In principle, this is a positive feature of TeleForm, as it serves to check an uncertainty. However, reviewing these answers requires additional working time for each questionnaire. The more serious false negative case was rare at less than one percent (0.95%), which supports the adequacy of using TeleForm.

All in all, the reliability and adequacy of the TeleForm system can be confirmed on the basis of the 300 verified questionnaires. However, the results are restricted to the GNC study centre in Dusseldorf, and the validated questionnaires represent only a very small sample. It therefore cannot be stated at this point whether our results can be applied to other studies. Nevertheless, this number is sufficient to uncover weaknesses of TeleForm. It can be concluded that human inspection will continue to play an essential role in the scan workflow of the GNC. A cross-check is recommended in order to avoid misread values in the database.

In future, the best solution would be to collect data exclusively by digital means in order to save time and to avoid errors during transfer.


6 Conclusion

In summary, the data acquisition software TeleForm with the settings used here plays an important role in the scan workflow of questionnaires in the GNC. It supports the reading-in of manually completed questionnaires and thus contributes considerably to facilitating work. The error frequency is already low, but might be reduced further by adjusting the settings in TeleForm.


Notes

Competing interests

The authors declare that they have no competing interests or personal relationships that could have appeared to influence the work reported in this paper. The authors received no funding for this analysis.

Acknowledgements

Finally, a special thanks goes to all study participants of the GNC health study, especially to those taking part in Dusseldorf.


References

1. Hardin JM, Woodby LL, Crawford MA, Windsor RA, Miller TM. Data collection in a multisite project: Teleform. Public Health Nurs. 2005 Jul-Aug;22(4):366-70. DOI: 10.1111/j.0737-1209.2005.220410.x
2. Quan KH, Vigano A, Fainsinger RL. Evaluation of a data collection tool (TELEform) for palliative care research. J Palliat Med. 2003 Jun;6(3):401-8. DOI: 10.1089/109662103322144718
3. Electric Paper Informationssysteme GmbH. TeleForm – Die universelle Capturing-Plattform für papierbasierte und elektronische Dokumente. Available from: https://www.electricpaper.de/files/electric_paper/PDF/TeleForm%20Broschuere_Electric%20Paper%20Informationssysteme.pdf
4. Jørgensen CK, Karlsmose B. Validation of automated forms processing. A comparison of Teleform with manual data entry. Comput Biol Med. 1998 Nov;28(6):659-67. DOI: 10.1016/s0010-4825(98)00038-9
5. Electric Paper Informationssysteme GmbH. TeleForm – Datenerfassung intelligent automatisieren. Available from: https://www.electricpaper.de/produkte/teleform.html
6. Hoffmann W, Jöckel KH, Kaaks R, Wichmann HE. The National Cohort – A prospective epidemiologic study resource for health and disease research in Germany: Wissenschaftliches Konzept der NAKO. 2015. Available from: https://nako.de/wp-content/uploads/2015/07/Wissenschaftliches-Konzept-der-NAKO2.pdf
7. Nolan MT, Mock V. Measuring Patient Outcome. California: Sage Publications; 2000.
8. Nies MA, Hein L. Teleform: a blessing or burden? Public Health Nurs. 2000 Mar-Apr;17(2):143-5. DOI: 10.1046/j.1525-1446.2000.00143.x
9. Universität Trier / Zentrum für Informations-, Medien- und Kommunikationstechnologie. Formulare entwerfen und automatisch erfassen mit TeleForm 9.1. [updated 2015 Jul 31].
10. Jenkins TM, Wilson Boyce T, Akers R, Andringa J, Liu Y, Miller R, Powers C, Ralph Buncher C. Evaluation of a Teleform-based data collection system: a multi-center obesity research case study. Comput Biol Med. 2014 Jun;49:15-8. DOI: 10.1016/j.compbiomed.2014.03.002