Request for insight into download failure

I created a new data source about 24 hours ago. The configuration went well (the connection to the datalogger was successful), except for the Level 3 security code issue, but that was covered in a different post. I set the initial download to grab the last month of data. During the past 24 hours, the status has mostly cycled between DOWNLOADING and FAILED. Some data has been downloaded, but not very much. Data collection has been working fine in LoggerNet.
We're currently looking into this issue.
Please note that I'll disable the Data Source during this process.
The issue relates to an acquisition attempt that exceeded our maximum time limit of 1 hour. In this scenario, records collected during the attempt are stored, and the next acquisition will resume from where the previous truncated attempt left off.
Keep in mind that this logger has a very high number of records in the PerScan table (approaching 1 million records), and during an acquisition attempt lasting 1 hour we observed approximately 12,000 records retrieved, so it may take a number of attempts before the Data Source is up to date. Alternatively, you can load in existing data via the Historic Data Import routine to manually bring the Data Source up to date.
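To put rough numbers on that (using the figures quoted above, which are observations rather than guarantees), the back-of-the-envelope estimate looks like this:

```python
import math

# Figures quoted above: ~1 million records in the PerScan table,
# ~12,000 records retrieved per 1-hour acquisition attempt.
total_records = 1_000_000
records_per_attempt = 12_000

attempts = math.ceil(total_records / records_per_attempt)
print(f"~{attempts} one-hour attempts needed")  # ~84 one-hour attempts
```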
@Jarrah I was assuming that my tables were not particularly large or small, but rather "midsize." I based that assumption on the following quote in the docs: "Historic data is always associated with an individual Node. You can store more than a billion individual data points per time series. This may include high-resolution millisecond data or historic records from the past 100+ years." The data sources that I've added so far, and was planning to continue adding, have nominal 20-second timestamp intervals and usually between 20 and 200 parameters each. One of the sources has about 3 years of data; the others have about 8 months of data.
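For scale, here is a quick calculation based on the intervals and durations above; it assumes each parameter maps to its own time series, which is my reading of the quoted doc:

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # ~31.6 million seconds

# From the post above: nominal 20-second interval, up to ~3 years of data.
records_per_year = SECONDS_PER_YEAR / 20
points_per_parameter = records_per_year * 3

print(f"{records_per_year:,.0f} records/year")          # 1,577,880
print(f"{points_per_parameter:,.0f} points/parameter")  # 4,733,640 over 3 years
```

Even the largest of these tables works out to a few million points per parameter, a couple of orders of magnitude under the documented billion-points-per-series capacity, which is what made me call them "midsize."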
Some other questions/notes follow. [Perhaps your answers/guidelines would be valuable to other customers if added to the official docs, because I expect other customers (including some of my own clients who may want their own eagle.io accounts in the future) will have similar requirements.]
In your last sentence, you mentioned two approaches for loading the existing data...Historical Data Import or manually bringing the Data Source up to date. Aren't they the same thing?
The docs mention a 250 MB limit for Historic Data Imports. Does the 1 hour limit also apply to this type of operation?
If the historical data files need to be broken up into much smaller chunks to avoid exceeding the 250 MB/1 hr limits, is it necessary to reconfigure each import from scratch for each file, even if the data file "schema" isn't changing from one import to the next? I'm going to play around with this all day today, so perhaps I'll answer my own question.
If moving "big data" directly into your system over the internet is impractical for technical reasons, perhaps you can offer a paid service in which someone can snail-mail large files to you, or share them via a cloud file storage service, and then you enter them into your system locally (unless you're using a cloud PaaS/IaaS approach, in which case you might be facing the same technical challenges). I have about 2-3 GB of raw historical data that I'd eventually like to put in eagle.io, but it sounds like it could take forever to do so using the eagle.io web interface. However, I could dump all the data into a cloud file storage service in a negligible amount of time.
When you refer to a "record," i.e., "12,000 records retrieved," do you mean a CSV file row or each timestamp/value pair?
Is there a way to cancel a collection operation that seems to be "stuck"? On my end, it looked like nothing was happening, so I was confused. The number-of-records indicator remained at zero (except for brief periods when it would increase before dropping back to zero), the state indicator fluctuated between DOWNLOADING and FAILED, and the latest-timestamp indicator moved forward and backward in time (such that I assumed the system was aborting and restarting the import from scratch). There was no indication of the nature of the problem, and no description of the 1 hour limit in the docs. The system appeared to be hung, but I didn't see any way to cancel it.
Right now, I have 7 dataloggers that I'm responsible for. I'm trying to figure out the best way to get all the data into eagle.io. Here is a summary of the dataloggers:
5 of them have 8 months of 20-second data with about 70 parameters each. All these data are stored in two places: 1) on the onboard NL115 Compact Flash card (physically located in Oman); and 2) as one archived CSV per month per datalogger (so about 8 * 5 = 40 files).
1 of the dataloggers is in Iowa. I recently erased the onboard CF card, so all these data are on my computer as monthly CSVs. I have about 36 months of 1-minute data from 2012 to June 2015. I recently updated the program to start storing 20-second data.
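Before importing, I figure it's worth checking the monthly archives against the 250 MB limit I mentioned above. A small sketch I'm planning to use (the directory layout is hypothetical: one folder per logger):

```python
from pathlib import Path

LIMIT_MB = 250  # Historic Data Import file-size limit mentioned in the docs

# Hypothetical layout: archives/<logger_name>/<YYYY-MM>.csv
for logger_dir in sorted(p for p in Path("archives").iterdir() if p.is_dir()):
    csvs = sorted(logger_dir.glob("*.csv"))
    total_mb = sum(f.stat().st_size for f in csvs) / 1e6
    oversize = [f.name for f in csvs if f.stat().st_size > LIMIT_MB * 1e6]
    print(f"{logger_dir.name}: {len(csvs)} files, {total_mb:.0f} MB total, "
          f"{len(oversize)} file(s) over {LIMIT_MB} MB")
```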
This issue relates to the one-time problem of bringing a large number of historical records into eagle.io from a data logger; in this context, "large" really means anything that takes more than one hour to fully download from the device. Every acquisition session is limited to a maximum of one hour, in order to balance our server resources. So after one hour of continuous communication with a device, we will end that communication and acquire whatever subset of records we could during that hour. The next attempt will acquire the next subset of records, until eventually we have acquired all the historical records.
The documentation refers to eagle.io's storage capacity, not the size of the table data stored on the logger; it's the latter that was described as 'large'. Naturally, the process of actually ingesting all this data into our system for the very first time can require some patience, and inevitably some repeated communication attempts.
We do offer a number of alternative methods for our customers to push large amounts of data into the system, including FTP and API access, but these methods would still require you to have the historical data in file form. Also, please note that currently you cannot acquire some records from a text file source via FTP, and then convert that to a datalogger source for additional acquisition. If you specifically want to back-fill records for a datalogger source, you would need to use the UI or the API to provide those records.
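For anyone curious what back-filling via the API might look like, here is an illustrative sketch only. The base URL, header name, endpoint path, node id, and payload format are all assumptions for the purpose of illustration; the eagle.io HTTP API documentation is the authority on the real interface.

```python
import requests

API_BASE = "https://api.eagle.io/api/v1"  # assumed base URL; verify in the API docs
API_KEY = "your-api-key"                  # placeholder credential
NODE_ID = "0123456789abcdef"              # hypothetical node id for the parameter

# Hypothetical payload: two timestamped records to back-fill.
payload = {
    "docType": "jts",  # JSON Time Series; assumed format, check the docs
    "data": [
        {"ts": "2015-06-01T00:00:00Z", "f": {"0": {"v": 12.5}}},
        {"ts": "2015-06-01T00:00:20Z", "f": {"0": {"v": 12.7}}},
    ],
}

resp = requests.put(
    f"{API_BASE}/nodes/{NODE_ID}/historic",  # endpoint path assumed
    json=payload,
    headers={"X-Api-Key": API_KEY},          # auth header name assumed
)
resp.raise_for_status()
```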
In this current situation where the historical data is inside loggers, the best method may be to just allow multiple 1-hour acquisitions to build the complete data set over time. The total time required by this approach will be limited by the speed of communication with the device.
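To illustrate the resume behaviour described above, here is a toy model (numbers shrunk for readability; this is not the actual server implementation):

```python
def run_sessions(total_records: int, per_session: int) -> None:
    """Toy model of resumable acquisition sessions."""
    cursor = 0   # count of records already stored
    session = 0
    while cursor < total_records:
        session += 1
        # Each session retrieves records until the time limit cuts it off;
        # modelled here as a fixed number of records per session.
        retrieved = min(per_session, total_records - cursor)
        cursor += retrieved  # partial progress is kept, never discarded
        print(f"session {session}: +{retrieved} records ({cursor}/{total_records})")

run_sessions(total_records=50, per_session=12)
# session 1: +12 records (12/50)
# ...
# session 5: +2 records (50/50)
```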
To answer your specific questions in more detail:
Yes, they are the same thing; Historic Data Import is the way of manually bringing the Data Source up to date.
Not exactly; while all processing by our acquisition system is limited to 1 hour sessions, the historic data would first be received by our FTP server before being passed to the acquisition system, and therefore would be processed very quickly. The only reason that acquisition from devices can take a long time is that our system communicates directly with the device, sometimes over a slow link such as a serial connection. Obviously an FTP upload will happen much faster, and we don't limit the amount of time you can be uploading to our FTP server.
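As a rough illustration, an FTP upload from Python takes only a few lines of standard library code; the host, credentials, and file name below are placeholders, as the real values come from your eagle.io Data Source configuration:

```python
from ftplib import FTP

HOST = "ftp.example.com"       # placeholder; use the host from your configuration
USER = "your-ftp-user"         # placeholder credential
PASSWORD = "your-ftp-password"

with FTP(HOST) as ftp:
    ftp.login(user=USER, passwd=PASSWORD)
    # Hypothetical monthly archive file being uploaded for acquisition.
    with open("2015-06_logger1.csv", "rb") as f:
        ftp.storbinary("STOR 2015-06_logger1.csv", f)
```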
Yes, if you are breaking up a larger file into 250MB chunks, it is necessary to configure each upload. One alternative for bulk data imports is to use our API.
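For what it's worth, splitting a large CSV while repeating the header row in each chunk is a few lines of standard Python; a minimal sketch (the file name and safety margin are arbitrary choices):

```python
LIMIT_BYTES = int(250e6 * 0.9)  # stay comfortably under the 250 MB limit

def split_csv(path: str) -> None:
    """Split a large CSV into chunks, repeating the header row in each chunk."""
    with open(path, "rb") as src:
        header = next(src)
        part, out, written = 0, None, 0
        for line in src:
            if out is None or written + len(line) > LIMIT_BYTES:
                if out is not None:
                    out.close()
                part += 1
                out = open(f"{path}.part{part:02d}.csv", "wb")
                out.write(header)
                written = len(header)
            out.write(line)
            written += len(line)
        if out is not None:
            out.close()

split_csv("logger1_history.csv")  # hypothetical file name
```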
Although we don't have the ability to receive snail-mail data, we do have a cloud-based alternative: if you upload your historical data into Dropbox (and your 2-3 GB of data would fit in a free Dropbox account), then eagle.io can acquire directly from Dropbox, which is fast because the data moves directly from Dropbox servers to eagle.io servers. Obviously you still need to wait for your local files to sync into Dropbox. This method is very popular with customers who are appending to data files on their computers; they simply put their data files in their Dropbox directory, and we automatically acquire the new data when it is appended to the file.
A "record" in our system means one timestamped row of data, regardless of the number of values in that row.
We understand the need to better communicate the status of long-running operations; currently it can be difficult to tell whether the operation is stuck or something is actually happening in the background. It can also be confusing that the Data Source state is set to FAILED after the one hour limit causes a disconnect, even though some partial records were actually acquired. Customers in your situation will see several successive FAILED attempts, even though some data is being successfully acquired each time, and all the data will be acquired eventually. We will work on improving feedback during the acquisition process.
After reading the summary of your 7 dataloggers, I don't see any problem with bringing that data into eagle.io, preferably using the CSV archives because this would be faster than communicating directly with the loggers.