Data Quality Index - Section: Overall Questions

IATI Secretariat • 1 September 2021

Instructions for submitting your feedback
  1. Feel free to share your overall feedback on the Data Quality Index Consultation through the comment box below.
  2. Consider the guiding questions outlined below.

Guiding questions:

  • If you could use only five words, how would you describe good quality IATI data?
  • Do you have suggestions on ways of using the Data Quality Index to incentivise better quality publishing and ensure publishers pay greater attention to the transparency of aid data?
  • Have we missed any important measures in the DQI? If so, what is your additional proposal?

  • Should an overall summary index measure be added? If so, what weight should be given to each measure? Or should each measure be left as a separate assessment?



Comments (4)

Herman van Loon
  • If you could use only five words, how would you describe good quality IATI data?

    Data successfully used by others.
     
  • Do you have suggestions on ways of using the Data Quality Index to incentivise better quality publishing and ensure publishers pay greater attention to the transparency of aid data?

    Publish the DQI in newsletters and other IATI publications.
     
  • Have we missed any important measures in the DQI? If so, what is your additional proposal?

    Regarding the availability of data, it is not enough that data have been published. Equally important is to assess that the published data are valid: e.g. that only existing codes are used and that references point to existing identifiers (a small sketch of such a check follows this comment).
     

  • Should an overall summary index measure be added? If so, what weight should be given to each measure? Or should each measure be left as a separate assessment?

    Yes, with emphasis on publishing valid, usable data. Weight should be given to elements relevant for network transparency and completeness of data, taking into account the role a publisher has in the network.
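
To illustrate the kind of validity check Herman describes, here is a minimal editorial sketch that flags sector codes not found on a codelist. It is not part of the DQI proposal: "activities.xml" and the three-code subset are hypothetical, and a real check would load the full DAC sector codelist from the IATI codelists site.

    # Editorial sketch: flag sector codes that are not on the official codelist.
    # "activities.xml" and the codelist subset below are hypothetical.
    import xml.etree.ElementTree as ET

    VALID_SECTOR_CODES = {"11110", "12220", "15110"}  # tiny illustrative subset

    tree = ET.parse("activities.xml")
    for activity in tree.getroot().findall("iati-activity"):
        iati_id = activity.findtext("iati-identifier", default="(missing)")
        for sector in activity.findall("sector"):
            code = sector.get("code")
            if code not in VALID_SECTOR_CODES:
                print(f"{iati_id}: unknown sector code {code!r}")

The same pattern extends to identifier references, e.g. resolving each participating-org ref against the list of registered publisher identifiers.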

leo stolk
  • If you could use only five words, how would you describe good quality IATI data?

Data useful and used by others and yourself

  • Do you have suggestions on ways of using the Data Quality Index to incentivise better quality publishing and ensure publishers pay greater attention to the transparency of aid data?

Make the DQI a dashboard, with recognition for degrees of quality in various categories. Allow publishers to use this recognition in their communications in an agreed way, with the IATI logo.

  • Have we missed any important measures in the DQI? If so, what is your additional proposal?

The length of trails and the last mile matter in terms of quality. Assessing the effort publishers make in helping the next peer publishers 'down the trail' could be a missing measure. How many recipient organisations publish to IATI?

  • Should an overall summary index measure be added? If so, what weight should be given to each measure? Or should each measure be left as a separate assessment?

I suggest calculating an average of the measure rankings, without placing too much emphasis on it; the individual scores per measure are most important.

matmaxgeds

Hi - my experience has been that what counts as 'good quality data' varies so much with what you are trying to use it for that you probably need a different index/measure for each use case. E.g. if I want to know 'how much aid went to Mali in 2020', there are many fields I am not bothered about, but comprehensiveness, use of the org IDs needed to avoid double counting, not deleting completed activities etc. are very important. That is completely different for other use cases, which would prioritise other fields. I think we need publishers to say 'I seek to meet use case X', so we can say: OK, then look at index Y, which specifically targets that use case.

Measures I would consider adding:

  1. RE document links - are the embedded links/contacts functioning? E.g. some donors have a URL for the project website that is just a link to the search function on their main website - this should not count. Links also need testing to see if they 404 etc.
  2. Whether old activities are published or are removed - and if the schedule for removal is shared e.g. in the org file (part of comprehensiveness e.g. if completed activities are removed in-year, you cannot make annual stats)
  3. Whether the org file indicates if the data is considered official by the publisher
  4. Whether the license allows commercial re-use
  5. RE sub-national location - I suggest removing a) the capital and b) the geographic center of the country or you will get too many false positives
  6. For coverage, the current measure will not capture cases where spend has been removed from the total expenditure element, e.g. by excluding spend from activities that are also not being reported - e.g. if a publisher org is only publishing a subset of their activities
  7. It might be worth a basic one on date failures, e.g. start date after end date (a sketch of this check, together with points 1 and 9, follows this list)
  8. One on 'duplicate activities' as these do not consistently appear in datastores
  9. One on failures where the activity-status does not agree with the dates, e.g. end date passed but still marked as in implementation
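
A rough editorial sketch of the mechanical checks in points 1, 7 and 9, under stated assumptions: "activities.xml" is a hypothetical input file, while the activity-date type codes (1-4 for planned/actual start and end) and activity-status code "2" (Implementation) come from the IATI codelists.

    # Editorial sketch of points 1, 7 and 9; "activities.xml" is hypothetical.
    import datetime as dt
    import urllib.request
    import xml.etree.ElementTree as ET

    tree = ET.parse("activities.xml")
    today = dt.date.today()

    for activity in tree.getroot().findall("iati-activity"):
        iati_id = activity.findtext("iati-identifier", default="(missing)")
        dates = {d.get("type"): d.get("iso-date")
                 for d in activity.findall("activity-date")}
        start = dates.get("2") or dates.get("1")  # actual start, else planned
        end = dates.get("4") or dates.get("3")    # actual end, else planned

        # Point 7: start date after end date (ISO dates compare as strings).
        if start and end and start[:10] > end[:10]:
            print(f"{iati_id}: start {start} is after end {end}")

        # Point 9: status "2" (Implementation) but the end date has passed.
        status = activity.find("activity-status")
        if status is not None and status.get("code") == "2" and end:
            if dt.date.fromisoformat(end[:10]) < today:
                print(f"{iati_id}: marked Implementation but ended {end}")

        # Point 1: do document links resolve, or do they 404?
        for link in activity.findall("document-link"):
            url = link.get("url")
            if not url:
                continue
            try:
                urllib.request.urlopen(url, timeout=10)  # raises on 404 etc.
            except Exception as exc:
                print(f"{iati_id}: {url} failed ({exc})")

Duplicate detection (point 8) would hang off the same loop, e.g. by counting repeated iati-identifier values.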

Great to see the validator will be used. Is there somewhere in the SSOT etc. that explains what counts as a 'warning' vs an 'error', and why, so that publishers can understand the feedback? Otherwise these terms are rather arbitrary.

Yohanna Loucheur

Would agree with several of Mat's suggestions, especially regarding the basic data failure checks, and also on the complexity of defining data quality - it really does depend on the use case one has in mind. It's almost like we need a big interactive dashboard to cover them adequately:

- press "Use Case A", a sub-set of measures adds up to provide a publisher's score in meeting it

- press "Use Case B", another sub-set (which includes the same mandatory elements but also some different measures) adds up and provides the relevant score (which may be completely different!);

- press "Humanitarian" or "Grand Bargain Use Case", the Grand Bargain elements add up to calculate the score. 

Granted, it would be hard to come up with a single, overall score for each publisher, but it would provide more useful guidance to data users. (A toy sketch of this per-use-case scoring follows.)
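
As an editorial illustration only (every measure name, score and weight below is invented), the per-use-case scoring described above could be a weighted average over a use-case-specific subset of measures:

    # Toy sketch of use-case-specific scoring; all names, scores and weights
    # are invented for illustration.
    MEASURES = {  # one publisher's per-measure scores, on a 0-1 scale
        "comprehensiveness": 0.9,
        "org_ids": 0.4,
        "timeliness": 0.7,
        "valid_locations": 0.2,
    }

    USE_CASES = {  # each use case picks its own measures and weights
        "Use Case A": {"comprehensiveness": 2, "org_ids": 1},
        "Use Case B": {"timeliness": 2, "valid_locations": 1},
    }

    def score(measures, weights):
        """Weighted average of the measures a use case cares about."""
        return sum(measures[m] * w for m, w in weights.items()) / sum(weights.values())

    for name, weights in USE_CASES.items():
        print(f"{name}: {score(MEASURES, weights):.2f}")

With these invented numbers the same publisher scores 0.73 on "Use Case A" and 0.53 on "Use Case B", which is exactly the point: there is no single quality number.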

 

