Line
Line operations consist of filters, transformers, and buffering. Each of these is described in the Basics and Reference documentation.
In addition to these basic operations, boolean operations, custom operations, and conversion to Row data are available. The rest of this page illustrates some uses of these operations.
Boolean Examples
Here are some examples using boolean Line filters.
not:
const String input = @"Line one: house
Line two: cat, dog
Line three: separation of economics and state
";
using (Pnyx p = new Pnyx())
{
    p.readString(input);
    p.not(p2 => p2.grep("state"));
    p.writeStdout();
}
// outputs:
// Line one: house
// Line two: cat, dog
or:
const String input = @"Line one: house
Line two: cat, dog
Line three: separation of economics and state
";
using (Pnyx p = new Pnyx())
{
    p.readString(input);
    p.or(p2 =>
    {
        p2.grep("cat");
        p2.egrep("Line one.*");
    });
    p.writeStdout();
}
// outputs:
// Line one: house
// Line two: cat, dog
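Conceptually, not and or behave like ordinary predicate combinators over lines. The following is a minimal sketch in plain C# that mirrors the two examples above without using any Pnyx types. The helpers Not and Or are hypothetical names introduced here for illustration, and String.Contains and Regex.IsMatch stand in for grep and egrep; their exact matching rules may differ from Pnyx's.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BooleanFilterSketch
{
    // Hypothetical helper: inverts a line predicate.
    public static Func<string, bool> Not(Func<string, bool> filter)
    {
        return line => !filter(line);
    }

    // Hypothetical helper: keeps a line when any of the predicates accepts it.
    public static Func<string, bool> Or(params Func<string, bool>[] filters)
    {
        return line => filters.Any(f => f(line));
    }

    public static void Main()
    {
        string[] lines =
        {
            "Line one: house",
            "Line two: cat, dog",
            "Line three: separation of economics and state"
        };

        // Stand-ins for grep("state"), grep("cat") and egrep("Line one.*").
        Func<string, bool> hasState = line => line.Contains("state");
        Func<string, bool> hasCat = line => line.Contains("cat");
        Func<string, bool> isLineOne = line => Regex.IsMatch(line, "Line one.*");

        // Both loops print "Line one: house" and "Line two: cat, dog".
        foreach (string line in lines.Where(Not(hasState))) Console.WriteLine(line);
        foreach (string line in lines.Where(Or(hasCat, isLineOne))) Console.WriteLine(line);
    }
}
```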
Custom Operations
The built-in Line operations for Pnyx only cover the very basics. Any advanced application will rely heavily on custom operations. Each Line interface is simple to implement and easy to cover with unit tests. In addition to implementing interfaces, Pnyx supports lambda expressions for quicker, more flexible usage. The following examples show the two Line operations with lambda expressions.
lineFilterFunc:
const String input = @"Text1with0numbers
log3message
Oliver Twist
";
using (Pnyx p = new Pnyx())
{
    p.readString(input);
    p.lineFilterFunc(line =>
    {
        String numbers = TextUtil.extractNumeric(line);
        return numbers.Length > 0 && int.Parse(numbers) > 5;
    });
    p.writeStdout();
}
// outputs:
// Text1with0numbers
lineTransformerFunc:
const String input = @"Text1with0numbers
log3message
Oliver Twist
";
using (Pnyx p = new Pnyx())
{
    p.readString(input);
    p.lineTransformerFunc(line =>
    {
        String numbers = TextUtil.extractNumeric(line);
        return numbers;
    });
    p.writeStdout();
}
// outputs:
// 10
// 3
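For comparison with the lambda forms above, the interface-based route can be sketched as a small class plus a direct check of its behavior. This sketch does not reference Pnyx: IExampleLineFilter is a hypothetical stand-in declared locally, so take the real interface name and method signature from the Reference documentation. The digit extraction is likewise a stand-in for TextUtil.extractNumeric, assuming it returns only the digits of the line.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for a Pnyx line-filter interface (an assumption,
// not the library's actual declaration).
public interface IExampleLineFilter
{
    bool shouldKeepLine(string line);
}

// Keeps lines whose digits, read as one number, exceed a threshold.
public class NumberThresholdFilter : IExampleLineFilter
{
    private readonly int threshold;

    public NumberThresholdFilter(int threshold)
    {
        this.threshold = threshold;
    }

    public bool shouldKeepLine(string line)
    {
        // Stand-in for TextUtil.extractNumeric: keep ASCII digits only.
        string digits = new string(line.Where(c => c >= '0' && c <= '9').ToArray());
        int value;
        // TryParse also rejects empty strings and values too large for int.
        return int.TryParse(digits, out value) && value > threshold;
    }
}

public static class NumberThresholdFilterCheck
{
    public static void Main()
    {
        var filter = new NumberThresholdFilter(5);
        Console.WriteLine(filter.shouldKeepLine("Text1with0numbers")); // True  (10 > 5)
        Console.WriteLine(filter.shouldKeepLine("log3message"));       // False (3 <= 5)
        Console.WriteLine(filter.shouldKeepLine("Oliver Twist"));      // False (no digits)
    }
}
```

Because the class holds no pipeline state, it can be exercised in a unit test by calling shouldKeepLine directly, without constructing a Pnyx instance.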
Convert To Row
Line data can be converted into Row data, which is essentially a List<String>. There are three built-in methods for parsing a Line into a Row: parseCsv, parseDelimiter, and parseTab. These methods are documented in Reference, Line, Row Conversion. For advanced parsing, implement IRowConverter and pass it to the lineToRow method. Below are some examples of converting Line data to Row data.
parseDelimiter
const String input = "a|b|c|d|e|f|g";
using (Pnyx p = new Pnyx())
{
    p.readString(input);
    p.sed("[aceg]", @"\0\0", "gi"); // duplicates every other char
    p.parseDelimiter("|");
    p.print("$1,$3,$5,$7|$2,$4,$6");
    p.writeStdout();
}
// outputs: aa,cc,ee,gg|b,d,f
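To make the data flow of the example above explicit, here is a rough equivalent in plain C# with no Pnyx types. It assumes, based on the output shown, that the print template substitutes $N with the N-th column counting from 1. The helper FormatColumns is a hypothetical name introduced for this sketch and handles only $1 and higher.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DelimiterSketch
{
    // Hypothetical helper: replaces each $N in the template with the
    // N-th column (1-based); out-of-range references become empty text.
    public static string FormatColumns(string[] columns, string template)
    {
        return Regex.Replace(template, @"\$(\d+)", m =>
        {
            int index = int.Parse(m.Groups[1].Value) - 1;
            return index >= 0 && index < columns.Length ? columns[index] : "";
        });
    }

    public static void Main()
    {
        // Step 1: duplicate every match of [aceg], like the sed call.
        string line = Regex.Replace("a|b|c|d|e|f|g", "[aceg]", "$0$0", RegexOptions.IgnoreCase);

        // Step 2: split the line into columns on the delimiter.
        string[] columns = line.Split('|');

        // Step 3: reorder the columns through the template.
        Console.WriteLine(FormatColumns(columns, "$1,$3,$5,$7|$2,$4,$6"));
        // prints: aa,cc,ee,gg|b,d,f
    }
}
```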
IRowConverter
public class ExampleRowConverter : IRowConverter
{
    public List<String> lineToRow(string line)
    {
        Tuple<String,String> pair = TextUtil.splitAt(line, ":=");
        return new List<String> { pair.Item1.Trim(), pair.Item2.Trim() };
    }

    public string rowToLine(List<String> row)
    {
        return String.Format("{0} := {1}", row[0], String.Concat(row.Skip(1)));
    }

    public IRowProcessor buildRowDestination(StreamInformation x, Stream y)
    {
        return null; // return 'null' if StreamToLineProcessor is acceptable
    }
}
const String input = "set x := (set == 0 ? 0 : 100 / set)";
using (Pnyx p = new Pnyx())
{
    p.readString(input);
    p.lineToRow(new ExampleRowConverter());
    p.withColumns(p2 => p2.sed("set[ ]*", "var ", "i"), 1); // replace 1st column
    p.writeStdout(); // auto converts back to line
}
// outputs: var x := (set == 0 ? 0 : 100 / set)
Embedded Newlines
When CSV data contains embedded newlines, meaning newlines within the content of a column, converting from Line to Row using parseCsv produces undesired results. This is because the standard line reader (StreamToLineProcessor) ends a line whenever it reads a newline. To solve this, use the asCsv method instead, which uses CsvStreamToRowProcessor to parse directly from the Stream without the data being read as a Line first. The following examples illustrate the difference between parseCsv and asCsv.
parseCsv
const String input = "a,\"Long\nText\n\"";
using (Pnyx p = new Pnyx())
{
    p.readString(input); // StreamToLineProcessor
    p.print("$0"); // forces line state
    p.parseCsv(strict: false);
    p.selectColumns(2,1);
    p.writeStdout();
}
// outputs:
// Long,a
// ,Text
// ,
asCsv
const String input = "a,\"Long\nText\n\"";
using (Pnyx p = new Pnyx())
{
    p.asCsv(p2 => p2.readString(input)); // CsvStreamToRowProcessor
    p.selectColumns(2,1);
    p.writeStdout();
}
// outputs:
// "Long\nText\n",a
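The difference between the two outputs above comes down to where line splitting happens. The following sketch reproduces the effect in plain C# without any Pnyx types: one method cuts the text into lines before parsing, the other scans the raw text and treats newlines inside double quotes as column content. Both methods are hypothetical illustrations written for this page, not the implementation of StreamToLineProcessor or CsvStreamToRowProcessor, and the quote-aware scanner is deliberately minimal (it ignores \r, for example).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class EmbeddedNewlineSketch
{
    // Line-first approach: split on newlines, then split each line on commas.
    public static List<List<string>> SplitLinesThenParse(string text)
    {
        var rows = new List<List<string>>();
        foreach (string line in text.Split('\n'))
        {
            rows.Add(new List<string>(line.Split(',')));
        }
        return rows;
    }

    // Stream-first approach: one pass over the raw text; a newline ends a
    // record only when it is outside double quotes. "" is an escaped quote.
    public static List<List<string>> ParseWholeText(string text)
    {
        var rows = new List<List<string>>();
        var row = new List<string>();
        var field = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < text.Length && text[i + 1] == '"')
                {
                    field.Append('"'); // escaped quote inside a quoted column
                    i++;
                }
                else if (c == '"') inQuotes = false;
                else field.Append(c);  // includes embedded newlines
            }
            else if (c == '"') inQuotes = true;
            else if (c == ',')
            {
                row.Add(field.ToString());
                field.Clear();
            }
            else if (c == '\n')
            {
                row.Add(field.ToString());
                field.Clear();
                rows.Add(row);
                row = new List<string>();
            }
            else field.Append(c);
        }

        // Flush the last record when the text does not end with a newline.
        if (field.Length > 0 || row.Count > 0)
        {
            row.Add(field.ToString());
            rows.Add(row);
        }
        return rows;
    }

    public static void Main()
    {
        const string input = "a,\"Long\nText\n\"";
        Console.WriteLine(SplitLinesThenParse(input).Count); // 3 fragments
        Console.WriteLine(ParseWholeText(input).Count);      // 1 record
    }
}
```

With the same input as the examples above, the line-first method yields three broken fragments, while the quote-aware scan yields a single two-column record whose second column still contains both newlines.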
Finally, there is an internal optimization that automatically reads directly from the source Stream for the "initial" parse method. The method IRowConverter.buildRowDestination is used whenever Pnyx can auto-wire an Input operation with an IRowConverter. This is illustrated below with CSV parsing, using an example similar to the one above. Notice, however, that the p.print call is missing; p.print is what forces Pnyx to parse the input as a Line first.
Auto-wire asCsv
const String input = "a,\"Long\nText\n\"";
using (Pnyx p = new Pnyx())
{
    p.readString(input); // CsvStreamToRowProcessor (auto-wired)
    p.parseCsv();
    p.selectColumns(2,1);
    p.writeStdout();
}
// outputs:
// "Long\nText\n",a
Next
Suggested next steps:
- Row, learn more about Row operations
- Input, learn more about Input operations
- Output, learn more about Output operations