A Reliable Method to Assess Keel Bone Fractures in Laying Hens From Radiographs Using a Tagged Visual Analogue Scale
Up to 97 % of laying hens housed in aviary systems are affected by keel bone fractures. Given the scope of the problem, multiple studies have investigated the causes and consequences of these fractures. The most frequently used detection techniques either lack accuracy and provide only vague information (palpation) or cannot be conducted longitudinally (dissection). Radiographic imaging overcomes these weaknesses: it allows longitudinal observations and provides detailed information on individual fractures, of which a single keel may have several at different locations and of different origins. However, no standardized system exists to assess fracture severity from radiographs when multiple fractures are present. The aim of this study was therefore to test the reliability of a scoring system that assesses the aggregate severity of multiple fractures, taking into account the characteristics of all fractures present (e.g. locations, callus formation, width of fracture gaps). We developed a scoring system based on a tagged visual analogue scale, ranging from score 0 (no fracture) to score 5 (extremely severe), with intermediate tags for scores 1, 2, 3, and 4. A catalogue of example scores was provided to describe the range of each score visually. An online tutorial consisting of an introduction, a training session, and a scoring session was completed by 14 participants with varying experience involving laying hens and keel bone damage. For inter-observer reliability, we found an intraclass correlation coefficient (ICC) of 0.985 with a 95 % confidence interval of 0.974 < ICC < 0.993 (average-rating, absolute-agreement, two-way random-effects model). The ICC for intra-observer reliability was 0.923 with a 95 % confidence interval of 0.879 < ICC < 0.951 (single-rating, absolute-agreement, two-way mixed-effects model). Intra-observer reliability ranged from 0.704 to 1.0, indicating excellent agreement and similar ratings across and within participants.
Furthermore, the high ICCs suggest that the introduction and training sessions were adequate tools to prepare observers for the assessment task despite the participants' varied backgrounds. Nonetheless, the validity of this scoring system needs further investigation in order to link responses of interest and biological relevance to the specific severity values resulting from our scoring system.
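To illustrate how the reported reliability statistics are defined, the following is a minimal sketch of the absolute-agreement ICC point estimates computed from two-way ANOVA mean squares. It is not the analysis code used in this study. The function name `icc_absolute_agreement`, the use of NumPy, and the example scores are assumptions made for illustration only. Rows are radiographs and columns are observers (or repeated sessions of one observer). The single-rating point estimate has the same computational form under the two-way random-effects and two-way mixed-effects models. Confidence intervals are not computed here.

```python
import numpy as np


def icc_absolute_agreement(ratings):
    """Return (single-rating ICC, average-rating ICC), absolute agreement.

    ratings: 2-D array-like, shape (n_subjects, k_raters), no missing values.
    Hypothetical helper for illustration; not the study's analysis code.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape

    grand_mean = x.mean()
    row_means = x.mean(axis=1)  # one mean per subject (radiograph)
    col_means = x.mean(axis=0)  # one mean per rater (observer)

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * np.sum((row_means - grand_mean) ** 2)
    ss_cols = n * np.sum((col_means - grand_mean) ** 2)
    ss_total = np.sum((x - grand_mean) ** 2)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Absolute agreement: rater variance stays in the denominator.
    single = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    average = (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)
    return single, average


# Made-up severity scores on the 0-5 scale: 5 radiographs, 3 observers.
example_scores = [
    [0.0, 0.2, 0.1],
    [1.5, 1.4, 1.7],
    [2.8, 3.0, 2.9],
    [4.1, 4.0, 4.3],
    [4.9, 5.0, 4.8],
]
single_icc, average_icc = icc_absolute_agreement(example_scores)
print(f"single-rating ICC: {single_icc:.3f}")
print(f"average-rating ICC: {average_icc:.3f}")
```

The average-rating form describes the reliability of the mean score across all observers, whereas the single-rating form describes the reliability of one observer's score, so the former is never lower than the latter when the single-rating estimate is non-negative.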