1. Comma delimited text conversion #1319 Posted by: jwebb 2002-11-10 08:50:58
I need to read a .csv file, format the data a bit (sorting, etc.), then produce an HTML file for output.
Would someone please give me a nudge in the right direction? Perhaps a sample of similar scope?

2. How to tokenize a string? #1321
I also want to know how to tokenize a string by a certain delimiter and return an array. There is no built-in string/word function that does exactly what I want.

3. Re: Comma delimited text conversion #1324 Posted by: 2002-11-10 13:22:17
Because a .csv file is a text file, you can easily read all of the lines into a stem variable with Reginald's LOADTEXT(). Each line would be in a separate variable.
To tokenize a string, look at the PARSE keyword. When the string you want to parse is the value of some variable, use PARSE VAR. (Most of the time, that's the case.) If the string is the return of some function, use PARSE VALUE... WITH. If you're getting the string directly from the user, then check out PARSE PULL.
A PARSE statement can break apart a string in all sorts of ways. It can break it apart by pattern matching (i.e., search strings), by blank spaces, by a certain number of characters or an offset in the string, or by any combination of those.
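As a rough illustration of those styles, here is a minimal sketch. The sample string and all of the variable names are made up for the example. PARSE PULL is left out because it waits for keyboard input.

```rexx
/* Sketch: a few ways a PARSE template can split a string */
line = "2002-11-10 jwebb;csv,html"     /* made-up sample data */

/* By blanks: the first name gets one word, the last name gets the rest */
PARSE VAR line stamp rest
SAY stamp                  /* 2002-11-10 */
SAY rest                   /* jwebb;csv,html */

/* By search strings: split at the ';' and then at the following ',' */
PARSE VAR rest user ';' first ',' second
SAY user                   /* jwebb */
SAY first                  /* csv */
SAY second                 /* html */

/* By position: columns 1-4, 6-7 and 9 onward; '.' discards the dashes */
PARSE VAR stamp year 5 . 6 month 8 . 9 day
SAY year month day         /* 2002 11 10 */

/* PARSE VALUE ... WITH parses the result of an expression */
PARSE VALUE TIME() WITH hh ':' mm ':' ss
SAY hh mm ss
```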
If you want to break the string apart by a certain delimiter (such as a comma), then check out the page (in my REXX book) "Using search strings (to break apart tokens)". There's an example that breaks a string apart at a semicolon and then at a following comma.

4. Re: Comma delimited text conversion #1329
.csv is "comma-separated values" -- a text file.
PARSE seems to be the best solution, but if the input string has an unknown number of fields, a single PARSE statement can't split it -- you need a loop.
By the way, the doc may need a cross link between the PARSE and String/word functions sections. I spent more than 10 minutes finding the example you mentioned. PARSE is a special statement in REXX. I didn't notice that at first, and just looked through the string functions for one that does what PARSE does.

5. Re: Comma delimited text conversion #1334 Posted by: 2002-11-10 21:22:23
There is an entire section named "Parsing" in the REXX book because PARSE is a very, very useful and important keyword. I don't know how you could have missed that section.
A csv file is a text file where each line contains numerous "values" separated by commas, something like this:
item 1, item 2, item 3
In that case, if you have an entire line in the variable MyVar, you can break off each "value" like so:
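/* Keep looping until nothing is left of the line */
DO WHILE myvar \= ""
   /* val gets the text before the first comma, myvar keeps the rest */
   PARSE VAR myvar val ',' myvar
   val = STRIP(val)
   SAY val
END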
If you wanted, you could collect the pieces in a stem variable:
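count = 0
/* Keep looping until nothing is left of the line */
DO WHILE myvar \= ""
   PARSE VAR myvar val ',' myvar
   count = count + 1
   /* Store the value in pieces.1, pieces.2, and so on */
   pieces.count = STRIP(val)
END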
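For the HTML output asked about at the top of the thread, the collected pieces could then be written out as a table row. This is only a minimal sketch: the sample line, the pieces.0 convention and the row variable are assumptions added for illustration, and the values are not HTML-escaped.

```rexx
/* Sketch: split one line at commas, then emit it as an HTML table row */
myvar = "item 1, item 2, item 3"    /* sample data, not from a real file */

count = 0
DO WHILE myvar \= ""
   PARSE VAR myvar val ',' myvar
   count = count + 1
   pieces.count = STRIP(val)
END
pieces.0 = count     /* assumption: element 0 holds the count, a common convention */

row = "<tr>"
DO i = 1 TO pieces.0
   row = row || "<td>" || pieces.i || "</td>"
END
row = row || "</tr>"
SAY row   /* <tr><td>item 1</td><td>item 2</td><td>item 3</td></tr> */
```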
Just one point. The above works unless you're dealing with a value that has embedded commas. Some csv files apparently quote values in order to allow embedded commas. For example, the line to parse may be:
item 1, "item 2, with an embedded comma", item 3
Furthermore, it appears that some databases use the "trick" of embedding a double quote character inside a quoted string by putting two of them back to back (just like you can do in a REXX literal string).
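To see the problem concretely, here is a small sketch that runs the simple comma loop over that sample line. The variable names are chosen for the example only.

```rexx
/* Sketch: the simple comma loop splits a quoted value in two */
line = 'item 1, "item 2, with an embedded comma", item 3'
count = 0
DO WHILE line \= ""
   PARSE VAR line val ',' line
   count = count + 1
   SAY count || ": " || STRIP(val)
END
/* Displays four pieces instead of the intended three:
   1: item 1
   2: "item 2
   3: with an embedded comma"
   4: item 3
*/

/* Doubling a quote inside a REXX literal string yields one quote */
SAY "He said ""hi"""     /* He said "hi" */
```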
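So, to account for these extra gotchas, you could call the following function to parse one line of a CSV file. Before the call, set the variable array to the name of the stem that should receive the values (including the trailing period). The function stores the values in elements 1, 2, 3 and so on, puts the count in element 0, and returns an empty string on success or an error message otherwise: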
parsecsv: PROCEDURE EXPOSE (array)
   /* array holds the name of the caller's stem, including the period */
   inside = 0      /* 1 while scanning between double quotes */
   count = 0       /* number of values found so far */
   orig = STRIP(ARG(1))
   IF orig \== "" THEN DO
      /* Make sure the last value is also terminated by a comma */
      IF RIGHT(orig, 1) \== ',' THEN orig = orig || ','
      totallength = LENGTH(orig)
      startpos = 1
      DO i = 1 TO totallength
         SELECT
            /* A comma outside of quotes ends the current value */
            WHEN SUBSTR(orig, i, 1) == ',' & \inside THEN
               DO
                  piece = STRIP(SUBSTR(orig, startpos, i - startpos))
                  startpos = i + 1
                  /* Remove the enclosing quotes, if any */
                  IF LEFT(piece, 1) == '"' & RIGHT(piece, 1) == '"' & piece \== '"' THEN
                     piece = SUBSTR(piece, 2, LENGTH(piece) - 2)
                  count = count + 1
                  /* Store the value, turning each "" into a single " */
                  CALL VALUE array || count, CHANGESTR('""', piece, '"')
               END
            /* A double quote toggles the inside/outside state */
            WHEN SUBSTR(orig, i, 1) = '"' THEN inside = 1 - inside
            OTHERWISE NOP
         END
      END
   END
   /* Element 0 receives the number of values */
   CALL VALUE array || "0", count
   IF inside THEN RETURN "The original line has an odd number of double quote characters!"
   RETURN ""
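A call might look roughly like the sketch below. It assumes the parsecsv function from this post is pasted into the same script below the EXIT. The stem name FIELDS., the line. stem and the sample lines are made up for the example.

```rexx
/* Sketch: calling parsecsv for a couple of sample lines.            */
/* Assumption: the parsecsv function is pasted below the EXIT.       */
array = "FIELDS."        /* name of the stem that parsecsv should fill */

line.1 = 'item 1, "item 2, with an embedded comma", item 3'
line.2 = '"say ""hi""", plain'
line.0 = 2

DO n = 1 TO line.0
   err = parsecsv(line.n)
   /* A non-empty return is the error message from parsecsv */
   IF err \== "" THEN SAY "Line" n || ": " || err
   DO i = 1 TO fields.0
      SAY n || "." || i || ": " || fields.i
   END
END
EXIT
```

With those two sample lines this should display three values for the first line (the quoted one kept whole, without its quotes) and two for the second, the first of which is say "hi".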