ETL - Sources
Source components represent the source of the data to be extracted. Some Extractors like JDBCExtractor work without a source, and thus can be optional.
Available Sources
file | input | http |
---|---|---|
file
Represents a source file, from which data is read. Files can be text files or compressed with tar.gz.
- Component name: file
Syntax
Parameter | Description | Type | Mandatory | Default value |
---|---|---|---|---|
path | File path | string | true | - |
lock | Lock the file while the extraction phase | boolean | false | false |
encoding | File encoding | string | false | UTF-8 |
Example
Extracts from the file "/temp/actor.tar.gz":
{ "file": { "path": "/temp/actor.tar.gz", "lock" : true , "encoding" : "UTF-8"} }
input
Extracts data from console input. This is useful when the ETL works in a PIPE with other tools
- Component name: input
Syntax
Parameter | Description | Type | Mandatory | Default value |
---|---|---|---|---|
Example
Extracts the file as input
cat /etc/csv|oetl.sh "{transformers:[{csv:{}}]}"
http
Uses an HTTP endpoint as a data source.
- Component name: http
Syntax
Parameter | Description | Type | Mandatory | Default value |
---|---|---|---|---|
url | HTTP URL to invoke | String | true | - |
method | HTTP Method between "GET", "POST", "PUT", "DELETE", "HEAD", "OPTIONS", "TRACE" | String | false | GET |
headers | Request headers as inner document key/value | Document | false |
Example
Execute an HTTP request against the URL "http://ip.jsontest.com/" in a GET, setting the User-Agent in the headers:
{ "http": {
"url": "http://ip.jsontest.com/",
"method": "GET",
"headers": {
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36"
}
}
}