PdfParserProject | RecentChanges | Preferences

LineStream parses a read stream into lines, treating a carriage return, a line feed, or a carriage return followed by a line feed as the delimiter.
Carriage return and line feed behavior seems to differ after a file transfer and between platforms. Here is what my local experts say about how to parse a line from a file. Comments?

"Read a line of text. A line is considered to be terminated by any one of a line feed ('\n'), a carriage return ('\r'), or a carriage return followed immediately by a linefeed. That's the definition of readLine() from the BufferedReader class in Java."

-- Patty
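For reference, the rule quoted above can be sketched in a few lines. This is only an illustration in Python, not the project's LineStream code; the function name `split_lines` is made up for the example, and it assumes the whole input is already in memory as a string.

```python
def split_lines(text):
    """Split text into lines; '\\n', '\\r', and '\\r\\n' each end exactly one line."""
    lines = []
    start = 0  # index where the current line begins
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch == "\r" or ch == "\n":
            lines.append(text[start:i])
            # A CR immediately followed by an LF counts as a single terminator.
            if ch == "\r" and i + 1 < n and text[i + 1] == "\n":
                i += 1
            start = i + 1
        i += 1
    # Anything after the last terminator is a final, unterminated line.
    if start < n:
        lines.append(text[start:])
    return lines
```

Note that an empty line between two terminators is kept as an empty string, so "a\r\n\r\nb" yields three lines, not two.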

Sounds good...I think we should check for all possibilities... --Marcia

Let me know how you want to work this - who will make the change and how to transfer the changes. (I'm fine with doing it - just want to make sure) -- Patty

  1. I already emailed the category file outs without the 3 types of checks for a carriage return.
  2. If there seems to be an error in our test case, it's only with LineStream, correct?
  3. We'll put the fix as part of our Demo3 rev.
  4. Patricia, could you do the fix and email just the method file out (NOT the entire class) to Liz? Then we'll compile a new image and email everyone a clean copy.
-- Marcia

Correct, it only affects LineStream, although this will affect classes like PdfFile that build upon it. I'll make my change during pockets of free time today. -- Patty

I noticed last night that LineStream and PdfStreamTokenizer perform the same task, though the functionality of LineStream is a subset of the other class. The stream tokenizer has the advantage that it holds a collection of delimiter symbols and parses on those. The only thing that would need to change for the stream tokenizer to act like LineStream would be the removal of the space ' ' delimiter... it also skips over every delimiter it meets, so a run of delimiters counts as one break: "<cr><lf> " is the same as "<cr> <cr><lf> " etc. Not to throw another monkey wrench into the pot. -- Ivan
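To make the difference concrete, here is a rough sketch of a delimiter-set tokenizer of the kind described above. It is written in Python purely for illustration and is not PdfStreamTokenizer's actual code; the name `tokenize` and its parameters are assumptions.

```python
def tokenize(text, delimiters):
    """Split text on any character in `delimiters`, skipping runs of delimiters."""
    tokens = []
    current = []  # characters of the token being built
    for ch in text:
        if ch in delimiters:
            # Close the current token, if any; consecutive delimiters add nothing.
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens
```

With the delimiter set {"\r", "\n", " "}, both "a\r\n b" and "a\r \r\n b" give the tokens "a" and "b". With the space removed from the set, the tokens become whole lines, but an empty line (two terminators in a row) still disappears, which a line reader following the readLine() rule would report as an empty line.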

You mean the parsing stuff??? True - I just reused the code that was shown in one of the lectures. We could probably refactor some of the code and get rid of this class altogether -- Liz

I wasn't sure how to get PdfStreamTokenizer to detect a cr-lf combination, so I went with LineStream's current next method. I just sent that to you guys by email. The test files I used are on our CSIL accounts in the ~pozon directory (my home dir). I didn't want to email these because I suspect MIME does something weird with attachments and newlines (I might be totally off-base). The filenames in that directory are: If you need/want to grab these and test on your local machines and are using ftp, please be sure to set the mode to binary (type 'bin' before transferring). Let me know if there are problems with the file out. To really view the newlines of these files, type 'od -c' followed by the filename at the Unix prompt. -- Patty 11/21/2000
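If 'od -c' is not handy on a local machine, a small script can report which terminators a file actually contains. The following is a sketch in Python, not part of the project; the function name `count_terminators` is made up, and it assumes the file is small enough to read into memory in one go.

```python
def count_terminators(data):
    """Count CR-LF pairs, lone CRs, and lone LFs in a bytes object."""
    counts = {"crlf": 0, "cr": 0, "lf": 0}
    i = 0
    while i < len(data):
        if data[i:i + 2] == b"\r\n":
            counts["crlf"] += 1
            i += 2  # consume both bytes of the pair
        elif data[i:i + 1] == b"\r":
            counts["cr"] += 1
            i += 1
        elif data[i:i + 1] == b"\n":
            counts["lf"] += 1
            i += 1
        else:
            i += 1
    return counts
```

To use it on a file, read the file in binary mode (for example `open(path, "rb").read()`, where `path` is whichever test file you transferred) so that no newline translation happens before counting.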
This page is read-only (last edited November 30, 2000 6:21 pm)