Thea Schepers posted in Data Use Community

Hi everyone! Detail question for you. I have noticed that the participants in our publishing course sometimes put spaces in the IDs of their activities. Is this ever a problem for data users out there?

Mark Brough

Yes! It can be a problem. We had this issue with Datastore Classic. A lot of the AidData data (which has not been updated for many years) has all kinds of problems with it. One of those problems is that there's a lot of whitespace in lots of elements, for example:

The identifier should be "US-501c3-522318905-SG-112009103505", but because of the whitespace around it, the actual link to this on Datastore Classic is as follows:

https://datastore.codeforiati.org/api/1/access/activity.xml?iati-identi…
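To illustrate why the whitespace changes the link, here is a minimal Python sketch. It only shows how an identifier would be percent-encoded for use in a URL; the padding around the identifier is an assumption for illustration, since the exact whitespace in the published file is not shown in this thread.

```python
from urllib.parse import quote

# The identifier as it should be, taken from the post above.
clean_id = "US-501c3-522318905-SG-112009103505"

# ASSUMPTION: hypothetical padding (a newline plus spaces on each side).
# The real whitespace in the published file may differ.
published_id = "\n    " + clean_id + "\n  "


def encode_identifier(identifier: str) -> str:
    """Percent-encode an identifier so it can be used as a URL query value."""
    return quote(identifier, safe="")


# The clean identifier needs no encoding at all.
print(encode_identifier(clean_id))

# The padded identifier picks up %0A (newline) and %20 (space) sequences,
# so a tool matching on the exact value needs the encoded whitespace too.
print(encode_identifier(published_id))
```

A data user who only knows the clean identifier would not guess the padded form, which is why the lookup fails unless the tool strips whitespace itself.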

Sarah McDuff

Hi Thea. I believe that this would more likely be an issue for tools processing/making the data available. I know that there were some discussions around this with the Datastore (previous version) and d-portal. Perhaps Amy Silcock could chime in here to elaborate.

Amy Silcock

Hi Thea, I agree with Mark's comments that it causes issues for IATI tools and external ones: d-portal, IATI Datastore, iati.cloud. You often can't use the standard notation to access a specific activity via an API/URL. There are workarounds, but ideally it needs to be fixed by the publisher.

We've added a safeguard to the Registry which automatically strips whitespace from IATI Org IDs, but we don't have a way to do this for activity IDs.

Thea Schepers

Justin Senn here's one: the publisher GB-CHC-328206 has a few spaces in their IDs, like "GB-CHC-328206-GB CHC 328206/6" and "GB-CHC-328206-UK Aid Match 2015-2018"
I do agree that it shouldn't happen. Ideally the publishing tool strips those spaces or gives an error. Fixing it at a later stage is always a workaround, in my opinion. We will mention it in the next round of webinars.
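As a rough idea of what "strips those spaces or gives an error" could look like, here is a minimal Python sketch. The function name `check_activity_id` and the choice to strip surrounding whitespace but reject internal whitespace are assumptions for illustration, not how any particular publishing tool behaves.

```python
import re


def check_activity_id(raw: str) -> str:
    """Return the identifier without surrounding whitespace.

    Hypothetical rule: surrounding whitespace is silently stripped,
    while whitespace inside the identifier raises an error so the
    publisher can decide how to fix it.
    """
    stripped = raw.strip()
    if re.search(r"\s", stripped):
        raise ValueError(f"identifier contains whitespace: {stripped!r}")
    return stripped


# Surrounding whitespace is removed.
print(check_activity_id("  US-501c3-522318905-SG-112009103505\n"))

# Internal spaces, as in the example above, are reported instead of guessed at.
try:
    check_activity_id("GB-CHC-328206-GB CHC 328206/6")
except ValueError as error:
    print(error)
```

Rejecting internal spaces instead of removing them leaves the decision with the publisher, since silently rewriting an identifier could break links to activities that are already published.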

Thea Schepers

Sarah McDuff I know of people who simply use the XML files, so they skip the tools altogether. In my experience that is much easier in some situations (mostly small-scale analysis).
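For anyone working directly with the XML files, a small check like the following could flag identifiers with whitespace. This is a sketch using Python's standard library; the sample XML is made up for illustration (two identifiers come from this thread, `XM-EXAMPLE-1` is hypothetical), and the function name `find_whitespace_ids` is an assumption.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: a made-up sample file. The whitespace padding and the
# identifier XM-EXAMPLE-1 are hypothetical.
SAMPLE_XML = """<iati-activities>
  <iati-activity>
    <iati-identifier>
      US-501c3-522318905-SG-112009103505
    </iati-identifier>
  </iati-activity>
  <iati-activity>
    <iati-identifier>GB-CHC-328206-UK Aid Match 2015-2018</iati-identifier>
  </iati-activity>
  <iati-activity>
    <iati-identifier>XM-EXAMPLE-1</iati-identifier>
  </iati-activity>
</iati-activities>"""


def find_whitespace_ids(xml_text: str) -> list[tuple[str, str]]:
    """Return (identifier, problem) pairs for identifiers containing whitespace."""
    root = ET.fromstring(xml_text)
    problems = []
    for element in root.iter("iati-identifier"):
        value = element.text or ""
        if value != value.strip():
            # Whitespace before or after the identifier text.
            problems.append((value.strip(), "surrounding whitespace"))
        elif any(ch.isspace() for ch in value):
            # Spaces, tabs or newlines inside the identifier itself.
            problems.append((value, "internal whitespace"))
    return problems


for identifier, problem in find_whitespace_ids(SAMPLE_XML):
    print(f"{identifier}: {problem}")
```

To run it on a real file, the sample string could be replaced with the contents of a downloaded IATI activity file.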

Herman van Loon

Whitespace causes all kinds of problems when using data. Imo it should be avoided for all NEW activities. For existing activities, though, changing the identifier is a problem: it will break traceability among publishers.

Thea Schepers

Steven Flower oh my, I see some things come back again and again... That thread got picked up 2 years ago as well!
Herman van Loon If you catch them early enough, and/or they don't have implementing partners who publish, it is probably relatively harmless. But we will ask our participants to avoid spaces, dashes, etc. in their IDs.

Mark Brough

Just one thing I would add on this (re discouraging the use of other characters) -- I think in that thread we also talked about the importance of publishers using the project code that's actually in their systems. For example, for the European Commission's International Partnerships, the codes are often of the form 2022/123-456. I think it's important that they continue to use this code, as it also means that others can easily refer to the same code, to support traceability, as Herman says.

Thea Schepers

Agreed. Although I wonder whether those slashes ever lead to issues in their own systems, exports, interfaces and such, but that's a different story altogether.

leo stolk

Another example of an org ID with special characters is PACJA: KE-NCB-OP.218/051/2009/0496/G065
Guess they also participated in your trainings. We got a warning back from MFA, so I removed the '.' and '/' in our admin. No traceability as a result.
I have informed PACJA so they can improve their Org ID on the Registry.

Thea Schepers

PACJA is fairly new so that is still a good solution in my opinion. I don't see that activity in their data now so they may have found a very drastic solution... I'll contact them.

