The IATI Technical Team has released v1.5.0 of Datastore Services, which affects Datastore Search.

1. Further enhancements to the Excel-optimised CSV format: Some multivalued columns returned by the Datastore Search had the potential to be incorrectly interpreted as long numbers by Excel (e.g. "41119,41119,41119" could in some circumstances be interpreted by Excel as "411194111941119"). To preserve the separation of these values and prevent incorrect interpretation by Excel, multivalued columns on the Excel-optimised CSV format are now separated by a pipe character (i.e. the example above will appear as "41119|41119|41119").


Full details of the changes can be seen in the GitHub release notes for Datastore Services. Many thanks to those who have used IATI Datastore Search and provided feedback and suggestions for improvements and new features. We continue to welcome all user feedback - do post comments below or email support@iatistandard.org .

Best wishes,

IATI Technical Team

Comments (2)

Siem Vaessen
Siem Vaessen

Does this mean users have to convert pipes back to its original chars - "41119|41119|41119" needs to be converted back to "41119,41119,41119" ?

IATI Technical Team
IATI Technical Team

The pipe is just a delimiter that denotes the cell is multivalued. If the user is currently parsing multivalued cells with a comma as a delimiter they would need to change it to a pipe.

Taking an example activity https://datastore.iatistandard.org/activity/41119-AF-S1-RT this has 3 entries of the element < participating-org>.

```< participating-org ref="41119" role="1" type="40">
< narrative>United Nations Population Fund< /narrative>
< /participating-org>
< participating-org ref="41119" role="2" type="40">
< narrative>United Nations Population Fund< /narrative>
< /participating-org>
< participating-org ref="41119" role="3" type="40">
< narrative>United Nations Population Fund< /narrative>
< /participating-org>
`

So when it's flattened by the Datastore, the participating_org_ref field holds a list of 3 values. Each of those values is "41119".

In JSON, this is represented as an Array:
"participating_org_ref": ["41119", "41119", "41119"]

In CSV, this is represented as:
,participating_org_ref,
,"41119,41119,41119",

In our Excel optimised CSV, it's represented as:
,participating_org_ref,
,"41119|41119|41119",

Note this change only affects the files returned from the EXCEL Download button in Datastore Search.

With the Datastore API, the CSV multivalued delimiter defaults to a comma and can be modified by the user as desired with the query parameter csv.mv.separator. https://solr.apache.org/guide/8_11/response-writers.html#csv-response-w…

Hope that makes sense. Let us know if you have further questions.


Please log in or sign up to comment.