Ultra-large file handling, the easy way

Enterprises will continue to produce and consume huge sets of data and transfer these between systems.

At what file size do your systems give up?


DataFeedr is your one-stop solution for files of any size, any type and any format.

Start a free trial and see the power and usability of DataFeedr.

Try Now

Still building bespoke solutions to work around infrastructure limitations?

DataFeedr provides full control over memory and CPU resources when handling ultra-large files.

Increasing or unpredictable file sizes are no longer a risk to your system's availability.



Building robust file integrations in under an hour

DataFeedr takes care of all file management and processing complexities for you, allowing you to focus solely on writing business logic for data transformation and data delivery.



Learn More

Ultra-large File Handling

DataFeedr is a file integration platform that enables its users to rapidly create file integrations capable of robustly and reliably processing data files of any size, type or format.

Whether the use case concerns CSV files, SAP IDOCs, or any other XML, plain text or binary file, with sizes from 1KB to 1TB and beyond, DataFeedr takes care of all the file management and pre-processing tasks for you, and all of it is driven by configuration.

The performance and capacity of DataFeedr are limited only by the configuration you define and the underlying infrastructure resources.


Everything Configuration

DataFeedr works with the concept of Tasks. A Task is a configuration that defines a file integration. It determines where the software checks for newly arrived files, how these files are processed and where the files are moved to once completed.

The processing instructions included in the Task configuration control both how the data is handled and how many resources the Task may use.

Resource control comes with strict governance, enabling administrators to set global restrictions and developers to set Task-level restrictions.
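As an illustration only, the relationship between global and Task-level restrictions can be sketched as below. DataFeedr's actual configuration format is not shown on this page, so every name here (`ResourceLimits`, `TaskConfig`, `effective_limits` and all fields) is hypothetical, and the sketch assumes that a Task-level restriction can tighten a global restriction but never loosen it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceLimits:
    """Hypothetical resource restrictions; not DataFeedr's real schema."""
    max_memory_mb: int
    max_threads: int


@dataclass(frozen=True)
class TaskConfig:
    """Hypothetical Task: where files arrive, who handles them, where they go."""
    name: str
    inbox_dir: str      # where newly arrived files are picked up
    handler_url: str    # HTTP(S) endpoint of the data handler
    done_dir: str       # where files are moved once completed
    limits: ResourceLimits


def effective_limits(global_limits: ResourceLimits, task: TaskConfig) -> ResourceLimits:
    """Assumption: the stricter of the global and Task-level values applies."""
    return ResourceLimits(
        max_memory_mb=min(global_limits.max_memory_mb, task.limits.max_memory_mb),
        max_threads=min(global_limits.max_threads, task.limits.max_threads),
    )
```

In this sketch an administrator owns the global `ResourceLimits`, while a developer owns the values inside each `TaskConfig`.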



Data Handling

To remain focused and lean, DataFeedr does not provide functionality for hosting data handler logic. Following the paradigm "Do one thing really well", it allows users to create data handling logic on any platform of choice, provided the handler can be accessed over HTTP(S).

The sole purpose of the data handler (the only actual code a file integration requires) is to receive slices of data, transform them and send the data to a destination, like a database or an event bus or a web service.
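To give a feel for how small such a handler can be, here is a minimal sketch in Python using only the standard library. The wire format is an assumption: the sketch pretends each slice arrives as an HTTP POST body holding CSV lines of the form `id,name`, which is not DataFeedr's documented protocol. The names `transform_slice`, `SliceHandler`, `serve` and `DELIVERED` are hypothetical, and the in-memory list stands in for a real destination.

```python
import csv
import io
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for a real destination such as a database or an event bus.
DELIVERED = []


def transform_slice(body: bytes) -> list:
    """Turn one slice (assumed: UTF-8 CSV lines 'id,name') into records."""
    reader = csv.reader(io.StringIO(body.decode("utf-8")))
    return [{"id": row[0], "name": row[1].strip().upper()} for row in reader if row]


class SliceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        try:
            records = transform_slice(body)
        except (UnicodeDecodeError, IndexError):
            # The slice could not be transformed: answer with a non-success status.
            self.send_response(422)
            self.end_headers()
            return
        DELIVERED.extend(records)  # "send the data to a destination"
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the handler locally; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), SliceHandler).serve_forever()
```

Calling `serve()` starts the handler on a local port; the receive, transform and deliver steps are the only code the integration needs in this sketch.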


Ready for webMethods IS

DataFeedr can work with any application platform supporting services over HTTP(S), but the integration with Software AG's webMethods platform goes one step beyond.

With the One-Step-Install, you can deploy DataFeedr alongside webMethods Integration Server. The installation includes a ready-to-run benchmark that can be used to verify the installation, to benchmark infrastructure performance or as a blueprint for your first file integration.

With minimal effort, your webMethods IS can process 100GB XML files without problems.

Try Now

DataFeedr

Features

Configuration

Flexible Control

The behaviour of the framework is fully driven by configuration files which are automatically reloaded with each modification.

The configuration hierarchy allows for strict governance by assigning different levels of that hierarchy to different teams or individuals, who together have full control over the usage of system resources.

Capacity

Size Doesn't Matter

DataFeedr can process files of any size (tested up to 1TB), of any type (binary or text), of any file format (XML, CSV, plain text, anything really) and created on any operating system (UNIX, DOS/Windows or MacOS).

In addition, it can be configured to work with any character set encoding supported by the underlying system.

Performance

Parallel Processing

DataFeedr has been designed for maximum performance and allows multiple tasks to run in parallel. To keep control over file sequence, DataFeedr ensures that files within a single task are processed in the order in which they were received.

Being built in Java and compatible with any JDK or JRE since 1.8, DataFeedr runs anywhere.
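The combination of parallel tasks and ordered files within a task can be pictured with the following Python sketch. It is not DataFeedr's implementation (which is built in Java); `run_task`, `run_all` and the `process` callback are hypothetical names, and a thread pool is simply one way to express the idea.

```python
from concurrent.futures import ThreadPoolExecutor


def run_task(task_name, files, process):
    """Process the files of ONE task strictly in arrival order."""
    return [process(task_name, f) for f in files]


def run_all(tasks, process, max_workers=4):
    """Run different tasks in parallel, one worker per task at a time.

    tasks maps a task name to its files, listed in the order they arrived.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            name: pool.submit(run_task, name, files, process)
            for name, files in tasks.items()
        }
        return {name: future.result() for name, future in futures.items()}
```

Because each task is submitted as a single unit of work, two files of the same task never overlap, while files of different tasks may be processed at the same time.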

Processing

Data Handlers

Data handlers can be created and hosted anywhere and on any platform. DataFeedr communicates with these data handlers over HTTP(S) using a predefined data and communications protocol.

As the DataFeedr foundation is based on a pluggable architecture, specific data handler implementations can be created for specific types of services. Currently, in addition to generic HTTP data handlers, DataFeedr comes with an implementation for calling webMethods services, e.g. Flow services.

Errors

Error Handling

DataFeedr distinguishes between functional errors and technical exceptions. A failure is considered a functional error if a data handler does not return the expected response. Any other failure is considered a technical exception.

Similar to data handlers, error and exception handlers can be configured for both types of failure. All available information regarding the failure is pushed to the handler and, in case of functional errors, the failed data slice is included as well.
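The distinction can be sketched as follows. This is a simplified model, not DataFeedr's code: it assumes the "expected response" is an HTTP status of 200, and the names `dispatch`, `call_handler`, `on_error` and `on_exception` are hypothetical.

```python
def dispatch(slice_data, call_handler, on_error, on_exception, expected_status=200):
    """Send one slice to the data handler and route any failure.

    call_handler(slice_data) is assumed to return an HTTP status code and to
    raise if the handler cannot be reached at all.
    """
    try:
        status = call_handler(slice_data)
    except Exception as exc:
        # Technical exception: the failure details are pushed, without the slice.
        on_exception({"reason": repr(exc)})
        return "technical"
    if status != expected_status:
        # Functional error: the failed data slice is included as well.
        on_error({"reason": f"unexpected status {status}", "slice": slice_data})
        return "functional"
    return "ok"
```

The two callbacks play the role of the configured error handler and exception handler.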

Visibility

Clear Analytics

A clean and complete dashboard is provided out of the box to inform you of progress, history, performance and active configurations.

DataFeedr's web interface allows you to easily navigate through the dashboard and find metrics that help you further tune and optimise your Task configurations.

Subscribe

Stay up to date on all latest developments.

Use cases

File to Cache

Caching is an essential component in improving the performance and accessibility of data within the enterprise landscape. Data from traditional enterprise applications in particular is generally provided in large data files. Updating a cache or restoring a cache after an outage has become a key factor in the availability of the data in the cache. Also, the time and effort needed to extend your cache with data from additional applications is critical to the adoption and success of your caching strategies.

DataFeedr enables you to realize the 'file-to-cache' use case in a few hours. It is a matter of implementing a data handler service that receives the data in small units of work, transforms the data into the desired structure or de-serializes it into objects, and pushes these into your cache. When an exception occurs within the data handler, the data is handed off to the error handler service, which allows you to reprocess it, store it for manual handling, start a workflow process or notify an administrator.

File to Database

Similar to the 'file-to-cache' use case, a data handler can be implemented to write the data to a relational or NoSQL database.

Again, implementing such a data handler is a matter of hours, where all complexity of file parsing and file transport is handled by DataFeedr.

File to Events

For those cases where the cache or database is in a remote location, DataFeedr can be used to split files into events or messages that can be transported across the enterprise network using a messaging platform.

File to File

File-to-file integrations, although deemed 'old-fashioned', are still a reality today. This traditional integration pattern is a necessity for the many companies that use legacy enterprise applications. Applications export large data sets in files which need to be transformed into a different file type, format or structure.

Using DataFeedr, realizing this use case requires you to implement a data handler service that receives the data, transforms it and writes it to a file. The data handler complexity is minimal and the full implementation cycle, including design and testing, can be completed in a single workday.

File Splitting

Similar to the 'file-to-messages' or 'file-to-events' use cases, DataFeedr can be used to split large files into smaller files using the same data handler logic. The target files can be written to a local or remote file system, depending on the requirements you have.
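As a rough picture of what splitting amounts to, the Python sketch below cuts a text file into parts of a fixed number of lines while reading it as a stream, so the whole file is never held in memory. It is a standalone illustration and does not use DataFeedr; `split_lines`, `split_file` and the part-naming scheme are hypothetical.

```python
import itertools


def split_lines(lines, lines_per_part):
    """Yield consecutive lists of at most lines_per_part lines."""
    if lines_per_part < 1:
        raise ValueError("lines_per_part must be at least 1")
    iterator = iter(lines)
    while True:
        part = list(itertools.islice(iterator, lines_per_part))
        if not part:
            return
        yield part


def split_file(src_path, dest_prefix, lines_per_part):
    """Write each part to '<dest_prefix>.0000', '<dest_prefix>.0001', and so on."""
    paths = []
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding="utf-8", newline="") as src:
        for index, part in enumerate(split_lines(src, lines_per_part)):
            path = f"{dest_prefix}.{index:04d}"
            with open(path, "w", encoding="utf-8", newline="") as dst:
                dst.writelines(part)
            paths.append(path)
    return paths
```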

Data Throttling

In all the above use cases, performance can be tuned to the maximum supported by the underlying infrastructure. However, there may be cases where the receiving enterprise system cannot cope with high throughput of data.

For such cases, DataFeedr can be configured to limit the throughput as per the receiving system's capacity. It will reduce the speed with which it reads the data files, ensuring IO congestion or CPU- or memory-resource thrashing is avoided.
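The idea of slowing down the reader, rather than buffering data, can be sketched as follows. This is an assumption-laden illustration, not DataFeedr's throttling mechanism: `throttled_chunks` and its parameters are hypothetical, and the limit is expressed in bytes per second for simplicity.

```python
import time


def throttled_chunks(stream, chunk_size, max_bytes_per_sec,
                     clock=time.monotonic, sleep=time.sleep):
    """Yield chunks from stream without exceeding max_bytes_per_sec on average.

    clock and sleep are injectable so the pacing can be tested without waiting.
    """
    start = clock()
    sent = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk
        sent += len(chunk)
        # Earliest moment at which 'sent' bytes are allowed to have been read.
        due = start + sent / max_bytes_per_sec
        delay = due - clock()
        if delay > 0:
            sleep(delay)
```

Because the pause happens before the next read, a slow receiving system translates directly into fewer reads from disk.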

Sample Performance Test
SAP IDOCs, 4.7GB data file, ALE export on webMethods

1,000 SAP IDOC documents (XML)

256 MB memory allocation

50 CPU threads limit

136 seconds processing duration (with 50ms simulated latency)
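For readers who want to relate these figures to throughput, the arithmetic can be worked out as below. The calculation assumes that the 1,000 documents make up the whole 4.7GB file and that 1GB is counted as 1,000MB; under those assumptions the test averages roughly 7.4 documents and about 34.6MB per second.

```python
# Figures taken from the sample performance test above.
documents = 1_000
file_size_mb = 4.7 * 1_000   # assumption: 1GB = 1,000MB
duration_sec = 136

docs_per_sec = documents / duration_sec
mb_per_sec = file_size_mb / duration_sec
avg_doc_size_mb = file_size_mb / documents

print(f"{docs_per_sec:.2f} documents/s, {mb_per_sec:.1f} MB/s, "
      f"{avg_doc_size_mb:.1f} MB per document on average")
```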

Contact Us

Get in Touch

Email

info@centipod.nl

Address

Centipod BV, Mauritsstraat 40, 2271 SE Voorburg, The Netherlands

Call at

+31 (0)6 233 866 45