State Machine

Internally, Pnyx has four primary states that control which method calls are legal. Specific methods (such as read, write, parse, and print) trigger the transitions between states. Each state restricts which methods may be called, and an illegal call throws an IllegalStateException. The following diagram illustrates the primary states and their transitions.
[Figure: State Diagram]

Use Once

Once you are familiar with the fluent API, its usage will feel natural. However, there are two non-obvious design limitations. The first is simply that a Pnyx object can only be used once. If your project needs to reuse Pnyx objects, it is recommended to create a factory method that builds new instances.
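As a minimal sketch of that factory approach, the class below builds a fresh Pnyx on every call. The names PipelineFactory, BuildColumnSwapper, and Run are hypothetical and not part of the library. The sketch uses only methods shown elsewhere on this page, and it assumes the Pnyx type is imported the same way as in those examples.

```csharp
// Sketch only: PipelineFactory, BuildColumnSwapper and Run are hypothetical names.
// Assumes the Pnyx type is in scope, as in the other examples on this page.
public static class PipelineFactory
{
    // Builds and configures a brand-new Pnyx on every call,
    // because a Pnyx instance can only be used once.
    public static Pnyx BuildColumnSwapper(string csvText)
    {
        Pnyx p = new Pnyx();
        p.readString(csvText);
        p.parseCsv();
        p.selectColumns(2, 1);
        p.writeStdout();
        return p;
    }

    public static void Run()
    {
        // Each input gets its own instance; the using block disposes it,
        // mirroring the using pattern of the other examples.
        using (Pnyx first = BuildColumnSwapper("Line,one\n")) { }
        using (Pnyx second = BuildColumnSwapper("Line,two\n")) { }
    }
}
```

Keeping the configuration in one method means callers never hold on to a used instance; they simply ask for a new one whenever another run is needed.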

Input / Output

When looking at the state machine, it is important to note that all I/O reads and writes are performed in the process state. All other steps simply build the filters, transformers, buffers, and processors that are used during the process state.

Single Input and Output

The other non-obvious limitation is the assumption of only one input and one output. This limitation is more a matter of style than a true limitation. Multiple inputs are accommodated via the cat method, and multiple outputs are accommodated via the tee method. Since the most common usage is a single input and a single output, the API was written to enforce that usage, with explicit methods for multiple inputs and outputs.

The following examples show how to use cat to read from multiple sources, including how to read them as CSV.
using (Pnyx p = new Pnyx())
{
    p.cat(pn =>
    {
        pn.readString("Line one");
        pn.readString("Line two");                    
        pn.readString("Line three");
        // ...
    });
    p.writeStdout();
}                        
// outputs:
// Line one
// Line two
// Line three
using (Pnyx p = new Pnyx())
{
    p.cat(p2 =>
    {
        p2.asCsv(p3 =>
        {
            p3.readString("Line,one");
            p3.readString("Line,two");
            p3.readString("Line,three");
        });
    });
    p.selectColumns(2, 1);
    p.writeStdout();
}
// outputs:
// one,Line
// two,Line
// three,Line

Tee

In addition to cat, Pnyx has support for multiple outputs with the tee processor. The tee processor internally creates a separate Pnyx object, which can be used for writing additional outputs. Since a separate Pnyx object is used, the tee processor can also perform separate operations on the data to create different outputs. Finally, the tee method does not change the state to End, and therefore, additional operations can be performed after the tee.

The example below shows using a tee to split a CSV by column to create 2 separate output files.
using (Pnyx p = new Pnyx())
{
    p.readString("1975,218M,\"Love Will Keep Us Together\"\n");
    p.parseCsv();
    p.tee(p2 =>
    {
        p2.selectColumns(1, 2);
        p2.write("us_population_by_year.csv");
        // outputs: 1975,218M                    
    });
    p.selectColumns(1, 3);
    p.write("top_songs_by_year.csv");
    // outputs: 1975,"Love Will Keep Us Together"
}
A subtle but powerful side effect of the tee processor is that the original state of the Pnyx is unchanged and can therefore be used for additional operations, including additional tees. The following example uses two tee processors.
using (Pnyx p = new Pnyx())
{
    p.readString("clientId: 123456\n");
    p.tee(p2 =>
    {
        p2.write("copy.txt");
        // outputs: clientId: 123456
    });
    p.parseDelimiter(": ");                
    p.selectColumns(2);
    p.tee(p2 =>
    {
        p2.write("ids.txt");
        // outputs: 123456
    });
    p.print("delete from client where id = $0;");
    p.writeStdout();
    // outputs: delete from client where id = 123456;
}

Next

Suggested next steps:
  • Line, learn more about Line operations
  • Row, learn more about Row operations