I just saved a couple of charts: green is the predicted output, red is the actual values. The predicted outputs seem clamped with these settings:

../vowpalwabbit/vw -d training.txt -k -c -f btce.model --loss_function squared -b 25 --passes 20 -q ee --l2 0.0000005

No decimation (downsampling) ~20K datapoints:

Downsampled with a factor of 8 (~2.5K datapoints):

../vowpalwabbit/vw -d training.txt -k -c -f btce.model --loss_function squared --passes 20 --l2 0.0000005

This model worked better; looking at it closely you can see:

And this is only working with about a fifth of the data collected so far. Crazy that it actually seems to work sort of... in a muted sense.

Here's the graphing code for good measure:

```
#! /usr/bin/python2
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal  # only needed if the decimation below is enabled

actual_values = []
predicted_values = []

# The label is everything before the first "|" on each line of the test set
with open('test.txt', 'r') as test_f:
    for line in test_f:
        actual_values.append(float(line.partition("|")[0]))

# One prediction per line, in the same order as the test set
with open('predictions.txt', 'r') as predictions_f:
    for line in predictions_f:
        predicted_values.append(float(line))

# Decimate the charts
# actual_values = signal.decimate(actual_values, 10)
# predicted_values = signal.decimate(predicted_values, 10)

data_len = len(actual_values)
print(data_len)

# Actual values in red, predictions in green
x = np.arange(0, data_len)
plt.plot(x, actual_values, 'r-', x, predicted_values, 'g-')
plt.show()
```
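To put a rough number on how "clamped" or "muted" the predictions look, one option is to compare the spread of the predicted values against the spread of the actual ones. This is just a sketch, not something I ran on the real data: `spread_ratio` is a made-up helper name, the lists at the bottom are toy values, and it assumes Python 3 for the `statistics` module.

```python
import statistics

def spread_ratio(actual, predicted):
    """Return stdev(predicted) / stdev(actual).

    A value well below 1 would suggest the predictions are muted
    relative to the real series. (Hypothetical helper, illustration only.)
    """
    actual_sd = statistics.pstdev(actual)
    if actual_sd == 0:
        raise ValueError("actual values have zero variance")
    return statistics.pstdev(predicted) / actual_sd

# Toy values only, not taken from the real dataset
toy_actual = [1.0, 3.0, 1.0, 3.0]
toy_predicted = [1.5, 2.5, 1.5, 2.5]
print(spread_ratio(toy_actual, toy_predicted))
```

The same function could be fed the `actual_values` and `predicted_values` lists from the graphing script to compare the two models.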

As I move forward I'll probably switch to working with gzipped datasets/vw cache files exclusively. I'm just worried about dealing with bigger datasets. Whee!
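On the graphing side, reading a gzipped test set could look something like the sketch below. It assumes Python 3, that each line starts with a bare numeric label followed by `|` (same as the script above), and that compressed files end in `.gz`; `open_maybe_gzipped` and `read_labels` are hypothetical helper names.

```python
import gzip

def open_maybe_gzipped(path):
    """Open path as text, decompressing on the fly if it ends in .gz."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")

def read_labels(path):
    """Return the label (text before the first '|') of each non-empty line as a float."""
    labels = []
    with open_maybe_gzipped(path) as f:
        for line in f:
            if line.strip():
                labels.append(float(line.partition("|")[0]))
    return labels
```

Iterating line by line keeps memory use down for the file handling itself, though the list of labels still grows with the dataset.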