There is an entire section named "Parsing" in the REXX book because PARSE is a very, very useful and important keyword. I don't know how you could have missed that section.
A CSV file is a text file where each line contains several "values" separated by commas, something like this:
item 1, item 2, item 3
In that case, if you have an entire line in the variable MyVar, you can break off each "value" like so:
DO WHILE myvar \= ""
   PARSE VAR myvar val ',' myvar
   val = STRIP(val)
   SAY val
END
If you wanted, you could collect the pieces in a stem variable:
count = 0
DO WHILE myvar \= ""
   PARSE VAR myvar val ',' myvar
   count = count + 1
   pieces.count = STRIP(val)
END
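To see the stem version in action, here is a minimal, self-contained sketch. The sample line and the habit of storing the count in pieces.0 are my own assumptions for illustration, not something the loop above requires.

```rexx
/* Sketch: split a simple CSV line into a stem, then list the pieces. */
myvar = "item 1, item 2, item 3"   /* sample data, assumed for illustration */
count = 0
DO WHILE myvar \= ""
   PARSE VAR myvar val ',' myvar   /* val gets the text before the first comma */
   count = count + 1
   pieces.count = STRIP(val)       /* drop leading and trailing blanks */
END
pieces.0 = count                   /* assumed convention: tail 0 holds the count */
DO i = 1 TO pieces.0
   SAY i': ['pieces.i']'
END
```

With that sample line it should display three pieces: item 1, item 2, and item 3. Note that the loop stops as soon as myvar is empty, so an empty value after a trailing comma is not counted.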
Just one point. The above works unless a value has embedded commas. Some CSV files apparently put quotes around such values to allow for that. For example, the line to parse may be:
item 1, "item 2, with an embedded comma", item 3
Furthermore, it appears that some databases use the "trick" of embedding a double quote inside a quoted value by putting two of them back to back (just like you can do in a REXX literal string).
So, to account for these extra gotchas, you could call the following function to parse one line of a CSV file:
parsecsv: PROCEDURE EXPOSE (array)
   /* The caller sets array to a stem name such as "PIECES." before calling. */
   inside = 0        /* 1 while scanning inside a quoted value */
   count = 0
   orig = STRIP(ARG(1))
   IF orig \== "" THEN DO
      /* Make sure the last value is terminated by a comma too. */
      IF RIGHT(orig, 1) \== ',' THEN orig = orig || ','
      totallength = LENGTH(orig)
      startpos = 1
      DO i = 1 TO totallength
         SELECT
            WHEN SUBSTR(orig, i, 1) == ',' & \inside THEN
               DO
                  piece = STRIP(SUBSTR(orig, startpos, i - startpos))
                  startpos = i + 1
                  /* Remove the surrounding quotes from a quoted value. */
                  IF LEFT(piece, 1) == '"' & RIGHT(piece, 1) == '"' & piece \== '"' THEN
                     piece = SUBSTR(piece, 2, LENGTH(piece) - 2)
                  count = count + 1
                  /* Store the value, turning each doubled quote into one. */
                  CALL VALUE array || count, CHANGESTR('""', piece, '"')
               END
            WHEN SUBSTR(orig, i, 1) == '"' THEN inside = 1 - inside
            OTHERWISE NOP
         END
      END
   END
   /* Tail 0 of the stem holds the number of values found. */
   CALL VALUE array || "0", count
   IF inside THEN RETURN "The original line has an odd number of double quote characters!"
   RETURN ""
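A calling sketch may help, since the function relies on EXPOSE (array): the caller has to put the name of the stem into the variable array first. The stem name PIECES. and the sample line are my own assumptions for illustration, and the snippet assumes the parsecsv function sits in the same program, after the EXIT. It also relies on an interpreter that provides CHANGESTR.

```rexx
/* Sketch: calling parsecsv, assumed to follow the EXIT in this same file. */
array = "PIECES."   /* name of the stem that parsecsv should fill in */
line = 'item 1, "item 2, with an embedded comma", "say ""hi"""'
err = parsecsv(line)
IF err \== "" THEN SAY err          /* non-empty result means unbalanced quotes */
DO i = 1 TO pieces.0                /* pieces.0 is set by parsecsv */
   SAY i': ['pieces.i']'
END
EXIT
```

If everything works as described, the second piece keeps its comma, and the third comes back as say "hi" with the outer quotes removed and the doubled quotes collapsed.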