If you have data files on your local machine that you want to analyze with Zepl, upload them by opening the right-hand menu bar in your notebook and choosing the Upload file button. Uploaded files are accessible only from the notebook in which they were uploaded; to use the same file in a different notebook, upload it to that notebook separately.
Common file formats uploaded include:
.CSV: Used to load small sample data files
.PARQUET: Used to upload sample data files
.PKL: Used to bring pre-trained models into your Zepl notebooks
Zepl supports files up to 25MB in size, and the files in each notebook may not exceed 100MB in total.
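Before uploading, you can check a file against these limits from any Python environment. The sketch below is purely illustrative (the helper name and paths are not part of Zepl); it only encodes the 25MB per-file and 100MB per-notebook limits stated above.

```python
import os

MAX_FILE_BYTES = 25 * 1024 * 1024       # 25MB per-file limit
MAX_NOTEBOOK_BYTES = 100 * 1024 * 1024  # 100MB total per notebook

def fits_upload_limits(file_size, already_uploaded=0):
    """Return True if a file of `file_size` bytes can still be uploaded,
    given how many bytes the notebook's existing files already use."""
    return (file_size <= MAX_FILE_BYTES
            and already_uploaded + file_size <= MAX_NOTEBOOK_BYTES)

# Example: check a local file before uploading it (path is hypothetical)
# size = os.path.getsize('titanic3.csv')
# print(fits_upload_limits(size))
```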
Once the file is uploaded to the notebook, you can access the file through the following URL (where <file-name> is the name of the file):
http://zdata/<file-name>
Use the examples below to load your data directly from the file system endpoint (URL) where the data is located. This method loads the data directly into a data object in the language of your choice.
%python
import pandas as pd

pandas_df = pd.read_csv('http://zdata/titanic3.csv', sep=';', header='infer')
%spark
import org.apache.spark.SparkFiles

sc.addFile("http://zdata/titanic3.csv")
val sparkDF = spark.read.format("csv")
  .option("delimiter", ";")
  .option("header", "true")
  .option("inferSchema", "true")
  .load(SparkFiles.get("titanic3.csv"))
%spark.pyspark
from pyspark import SparkFiles

sc.addFile('http://zdata/titanic3.csv')
sparkDF = spark.read.format('csv') \
    .options(delimiter=';', header='true', inferSchema='true') \
    .load(SparkFiles.get('titanic3.csv'))
%r
table <- read.table("http://zdata/titanic3.csv", header = TRUE, sep = ";", dec = ".")
%spark.r
spark.addFile("http://zdata/titanic3.csv")
sparkDF <- read.df(path = spark.getSparkFiles("titanic3.csv"), source = "csv", delimiter = ";", header = "true", inferSchema = "true")
Zepl also allows you to download data files directly to the container's file system. This method is often used when the original file needs to be modified after your notebook has executed.
%python
!wget http://zdata/<file-name>
%python
!ls <file-name>
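If `wget` is unavailable, the same download can be done with the Python standard library. This is a sketch, assuming the notebook's `http://zdata/` endpoint is reachable from the container; the function name is illustrative, and `<file-name>` should be replaced with your actual file name.

```python
import urllib.request

def download_to_container(file_name, base_url='http://zdata/'):
    """Fetch an uploaded file from the zdata endpoint into the
    container's current working directory and return its local path."""
    local_path = file_name
    urllib.request.urlretrieve(base_url + file_name, local_path)
    return local_path

# Example (hypothetical file name):
# download_to_container('titanic3.csv')
```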
%python
import pandas as pd

pandas_df = pd.read_csv('titanic3.csv', sep=';', header='infer')
%spark
val sparkDF = spark.read.format("csv")
  .option("delimiter", ";")
  .option("header", "true")
  .option("inferSchema", "true")
  .load("titanic3.csv")
%spark.pyspark
sparkDF = spark.read.format('csv') \
    .options(delimiter=';', header='true', inferSchema='true') \
    .load('titanic3.csv')
%r
table <- read.table("titanic3.csv", header = TRUE, sep = ";", dec = ".")
%spark.r
sparkDF <- read.df(path = "titanic3.csv", source = "csv", delimiter = ";", header = "true", inferSchema = "true")
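Once a file lives in the container's file system, you can modify the local copy and save a new version. The sketch below uses only the standard library; the function name, file names, and the added `row_id` column are illustrative, not part of Zepl's API.

```python
import csv

def add_row_id_column(src_path, dest_path):
    """Read a semicolon-delimited CSV, prepend a row_id column,
    and write the modified copy back to the container's file system."""
    with open(src_path, newline='') as f:
        rows = list(csv.reader(f, delimiter=';'))
    header, body = rows[0], rows[1:]
    out = [['row_id'] + header] + [[str(i)] + r for i, r in enumerate(body)]
    with open(dest_path, 'w', newline='') as f:
        csv.writer(f, delimiter=';').writerows(out)
    return dest_path

# Example (hypothetical file names):
# add_row_id_column('titanic3.csv', 'titanic3_with_ids.csv')
```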
You cannot edit data directly within Zepl, but you can overwrite the data file by uploading a file with the same name.
Overwritten data cannot be recovered.
To delete data, click the red "x" button next to the data file in the Files tab in your notebook.
Deleted data cannot be recovered.