mirror of
https://github.com/ultravioletrs/cocos.git
synced 2026-06-23 04:10:25 +00:00
NOISSUE - Enable WASM Support and FileSystem Support (#189)
* feat(algorithm): Add wasm as an algo type
  Signed-off-by: Rodney Osodo <socials@rodneyosodo.com>

* feat(algorithm): Use filesystem to store results
  Move from unix socket for results storage to filesystem

* test: test new filesystem changes
  Signed-off-by: Rodney Osodo <socials@rodneyosodo.com>

* refactor(files): rename resultFile to resultsFilePath

* feat(wasm-runtime): change from wasmtime to wasmedge
  Wasmedge enables easier directory mapping to get results
  Signed-off-by: Rodney Osodo <socials@rodneyosodo.com>

* feat(algorithm): send results as zipped directory
  Create a new function to zip the results directory and send it back to the user

* fix(wasm): runtime argument
  Fix the directory mapping for wasm runtime arguments
  Signed-off-by: Rodney Osodo <socials@rodneyosodo.com>

* fix(errors): provide useful error message

* chore(gitignore): add results zip to gitignore

* feat(filesystem): Enable storing results on filesystem for python algos

* refactor: revert to upstream cocos repo
  Signed-off-by: Rodney Osodo <socials@rodneyosodo.com>

* fix: remove AddDataset from algorithm interface

* fix: agent to handle results zipping

* test: test zipping directories

* refactor(agent): Handle file operations from agent

* test: run test inside eos
  Signed-off-by: Rodney Osodo <socials@rodneyosodo.com>

* refactor(test): Document and test algos are running
  Document steps on running the 2 python examples and ensure they are running on eos
  Signed-off-by: Rodney Osodo <socials@rodneyosodo.com>

* fix: remove witheDataset option

* test: test without dataset argument
  Signed-off-by: Rodney Osodo <socials@rodneyosodo.com>

---------

Signed-off-by: Rodney Osodo <socials@rodneyosodo.com>
+98
-10
@@ -1,17 +1,105 @@
|
||||
# Algorithm
|
||||
|
||||
Agent accepts binaries programs. To use the python program you need to bundle or compile it.
|
||||
In this example we'll use [pyinstaller](https://pypi.org/project/pyinstaller/)
|
||||
The agent accepts binary programs, Python scripts, and Wasm files. It runs them in a sandboxed environment and returns the output.
|
||||
|
||||
```shell
|
||||
pip install pandas scikit-learn
|
||||
pip install -U pyinstaller
|
||||
pyinstaller --onefile lin_reg.py
|
||||
## Python Example
|
||||
|
||||
To test that these examples work on your local machine, install the following dependencies:
|
||||
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
Make the binary static:
|
||||
This can be done in a virtual environment.
|
||||
|
||||
```shell
|
||||
pip install staticx
|
||||
staticx <dynamic_binary_file_path> <output_file_path>
|
||||
```bash
|
||||
python -m venv venv
|
||||
source venv/bin/activate
|
||||
pip install -r requirements.txt
|
||||
```

To run the examples, use the following commands:

```bash
python3 test/manual/algo/addition.py
```

The addition example is a simple algorithm that demonstrates running an algorithm without any external dependencies or input arguments. It returns the sum of two numbers.
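
As a rough sketch of the pattern the addition example follows, the snippet below computes a sum and writes it to a results directory. The `results` directory and `result.txt` file name mirror the constants in this commit's `addition.py`; the standalone functions and the `results_dir` and `results_file` parameters are assumptions added here so the output location can be changed.

```python
import os


def compute(a, b):
    """Return the sum of two numbers."""
    return a + b


def save_result(result, results_dir="results", results_file="result.txt"):
    """Write the result as text to results_dir/results_file and return the path."""
    # exist_ok=True avoids an error when the directory already exists.
    os.makedirs(results_dir, exist_ok=True)
    path = os.path.join(results_dir, results_file)
    with open(path, "w") as f:
        f.write(str(result))
    return path
```

With the defaults, `save_result(compute(5, 7))` creates `results/result.txt` in the current working directory; the inputs 5 and 7 are illustrative only.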

```bash
python3 test/manual/algo/lin_reg.py
```

The linear regression example is a more complex algorithm that requires external dependencies. Despite its file name, the script trains a logistic regression model on the iris dataset found [here](../data/) for demonstration purposes. When run without arguments, it reads its input from a `datasets` directory, which must contain exactly one file.
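
In this commit, `lin_reg.py` looks for its input in a directory that must contain exactly one file. A minimal standard-library sketch of that check follows. The names `find_single_dataset` and `read_rows` are hypothetical, and `csv.DictReader` stands in for the `pandas.read_csv` call the script itself uses.

```python
import csv
import os


def find_single_dataset(data_dir):
    """Return the path of the only file in data_dir, or raise ValueError."""
    files = sorted(os.listdir(data_dir))
    if len(files) != 1:
        raise ValueError(
            f"expected exactly one file in {data_dir!r}, found {len(files)}"
        )
    return os.path.join(data_dir, files[0])


def read_rows(csv_path):
    """Read a CSV file with a header row into a list of dictionaries."""
    with open(csv_path, newline="") as f:
        return list(csv.DictReader(f))
```

Raising an exception rather than printing and exiting lets the caller decide how to report a missing or ambiguous dataset.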

```bash
python3 test/manual/algo/lin_reg.py predict result.zip test/manual/data
```

This runs inference with the trained model stored in `result.zip`.

To run the examples in the agent, use the following command:

```bash
go run ./test/computations/main.go ./test/manual/algo/lin_reg.py public.pem false ./test/manual/data/iris.csv
```

Run this command from the root directory of the project. It starts the computation server.

In another window, run the following command:

```bash
sudo MANAGER_QEMU_SMP_MAXCPUS=4 MANAGER_GRPC_URL=localhost:7001 MANAGER_LOG_LEVEL=debug MANAGER_QEMU_USE_SUDO=false MANAGER_QEMU_ENABLE_SEV=false MANAGER_QEMU_SEV_CBITPOS=51 MANAGER_QEMU_ENABLE_SEV_SNP=false MANAGER_QEMU_OVMF_CODE_FILE=/usr/share/edk2/x64/OVMF_CODE.fd MANAGER_QEMU_OVMF_VARS_FILE=/usr/share/edk2/x64/OVMF_VARS.fd go run main.go
```

Run this command from the [manager main directory](../../../cmd/manager/). It starts the manager. Make sure you have already built the [qemu image](../../../hal/linux/README.md).

In another window, run the following command:

```bash
./build/cocos-cli algo ./test/manual/algo/lin_reg.py ./private.pem -a python -r ./test/manual/algo/requirements.txt
```

Make sure you have built `cocos-cli` first. This command uploads the algorithm and the requirements file.

Next, upload the dataset:

```bash
./build/cocos-cli data ./test/manual/data/iris.csv ./private.pem
```

Once the results are ready, retrieve them with the following command:

```bash
./build/cocos-cli results ./private.pem
```

This returns the results of the algorithm.
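
Because the results are sent back as a zipped directory (the inference commands in this document read a `result.zip`), they can be inspected without unpacking them to disk. The sketch below assumes the archive holds a text file named `result.txt`, the name `addition.py` writes in this commit. The archive layout produced by the agent is not shown here, so the hypothetical helper `read_text_result` matches on the file's base name.

```python
import zipfile


def read_text_result(archive_path, file_name="result.txt"):
    """Return the text of the first archive member whose base name is file_name."""
    with zipfile.ZipFile(archive_path) as archive:
        for member in archive.namelist():
            # Match on the base name so the file is found whether or not
            # the archive nests it under a directory.
            if member.rsplit("/", 1)[-1] == file_name:
                return archive.read(member).decode("utf-8")
    raise FileNotFoundError(f"{file_name} not found in {archive_path}")
```

This only suits text results such as the addition example; the model file from the regression example is binary and is loaded with `joblib` instead.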

To run inference on the results, use the following command:

```bash
python3 test/manual/algo/lin_reg.py predict result.zip test/manual/data
```

For the addition example, use the following commands:

```bash
go run ./test/computations/main.go ./test/manual/algo/addition.py public.pem false
```

```bash
./build/cocos-cli algo ./test/manual/algo/addition.py ./private.pem -a python
```

```bash
./build/cocos-cli results ./private.pem
```

## Wasm Example

More information on how to run Wasm files can be found [here](https://github.com/ultravioletrs/ai/tree/main/burn-algorithms).

## Binary Example

More information on how to run binary files can be found [here](https://github.com/ultravioletrs/ai/tree/main/burn-algorithms).
@@ -1,9 +1,14 @@
|
||||
import sys, io
|
||||
import joblib
|
||||
import socket
|
||||
import os
|
||||
import sys
|
||||
import zipfile
|
||||
|
||||
RESULTS_DIR = "results"
|
||||
RESULTS_FILE = "result.txt"
|
||||
|
||||
|
||||
class Computation:
|
||||
result = 0
|
||||
|
||||
def __init__(self):
|
||||
"""
|
||||
Initializes a new instance of the Computation class.
|
||||
@@ -16,45 +21,35 @@ class Computation:
|
||||
"""
|
||||
self.result = a + b
|
||||
|
||||
def send_result(self, socket_path):
|
||||
def save_result(self):
|
||||
"""
|
||||
Sends the result to a socket.
|
||||
Saves the result to a file.
|
||||
"""
|
||||
buffer = io.BytesIO()
|
||||
|
||||
try:
|
||||
joblib.dump(self.result, buffer)
|
||||
except Exception as e:
|
||||
print("Failed to dump the result to the buffer: ", e)
|
||||
return
|
||||
os.makedirs(RESULTS_DIR)
|
||||
except FileExistsError:
|
||||
pass
|
||||
|
||||
data = buffer.getvalue()
|
||||
with open(RESULTS_DIR + os.sep + RESULTS_FILE, "w") as f:
|
||||
f.write(str(self.result))
|
||||
|
||||
client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
|
||||
try:
|
||||
try:
|
||||
client.connect(socket_path)
|
||||
except Exception as e:
|
||||
print("Failed to connect to the socket: ", e)
|
||||
return
|
||||
try:
|
||||
client.send(data)
|
||||
except Exception as e:
|
||||
print("Failed to send data to the socket: ", e)
|
||||
return
|
||||
finally:
|
||||
client.close()
|
||||
|
||||
def read_results_from_file(self, results_file):
|
||||
"""
|
||||
Reads the results from a file.
|
||||
"""
|
||||
try:
|
||||
results = joblib.load(results_file)
|
||||
print("Results: ", results)
|
||||
except Exception as e:
|
||||
print("Failed to load results from file: ", e)
|
||||
return
|
||||
        if results_file.endswith(".zip"):
            os.makedirs(RESULTS_DIR, exist_ok=True)
            with zipfile.ZipFile(results_file, "r") as zip_ref:
                zip_ref.extractall(RESULTS_DIR)
            # Read the extracted file from the results directory.
            with open(os.path.join(RESULTS_DIR, RESULTS_FILE), "r") as f:
                print(f.read())
        else:
            with open(results_file, "r") as f:
                print(f.read())
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
a = 5
|
||||
@@ -62,15 +57,10 @@ if __name__ == "__main__":
|
||||
computation = Computation()
|
||||
|
||||
if len(sys.argv) == 1:
|
||||
print("Please provide a socket path or a file path")
|
||||
exit(1)
|
||||
|
||||
if sys.argv[1] == "test" and len(sys.argv) == 3:
|
||||
computation.read_results_from_file(sys.argv[2])
|
||||
elif len(sys.argv) == 2:
|
||||
computation.compute(a, b)
|
||||
computation.send_result(sys.argv[1])
|
||||
computation.save_result()
|
||||
elif len(sys.argv) == 3 and sys.argv[1] == "test":
|
||||
computation.read_results_from_file(sys.argv[2])
|
||||
else:
|
||||
print("Invalid arguments")
|
||||
exit(1)
|
||||
|
||||
|
||||
+100
-31
@@ -1,47 +1,116 @@
|
||||
import sys, io
|
||||
import os
|
||||
import sys
|
||||
import joblib
|
||||
import socket
|
||||
|
||||
import pandas as pd
|
||||
from sklearn.model_selection import train_test_split
|
||||
from sklearn.linear_model import LogisticRegression
|
||||
import zipfile
|
||||
from sklearn import metrics
|
||||
|
||||
csv_file_path = sys.argv[2]
|
||||
iris = pd.read_csv(csv_file_path)
|
||||
DATA_DIR = "datasets"
|
||||
RESULTS_DIR = "results"
|
||||
RESULTS_FILE = "model.bin"
|
||||
|
||||
# Droping the Species since we only need the measurements
|
||||
X = iris.drop(['Species'], axis=1)
|
||||
|
||||
# converting into numpy array and assigning petal length and petal width
|
||||
X = X.to_numpy()[:, (3,4)]
|
||||
y = iris['Species']
|
||||
class Computation:
|
||||
model = None
|
||||
|
||||
# Splitting into train and test
|
||||
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.5, random_state=42)
|
||||
def __init__(self):
|
||||
"""
|
||||
Initializes a new instance of the Computation class.
|
||||
"""
|
||||
pass
|
||||
|
||||
log_reg = LogisticRegression()
|
||||
log_reg.fit(X_train,y_train)
|
||||
def _read_csv(self, data_path=""):
|
||||
"""
|
||||
Reads the CSV file.
|
||||
"""
|
||||
files = os.listdir(data_path)
|
||||
if len(files) != 1:
|
||||
print("Expected exactly one file in the directory")
|
||||
exit(1)
|
||||
csv_file_path = data_path + os.sep + files[0]
|
||||
return pd.read_csv(csv_file_path)
|
||||
|
||||
# Serialize the trained model to a byte buffer
|
||||
model_buffer = io.BytesIO()
|
||||
joblib.dump(log_reg, model_buffer)
|
||||
def compute(self):
|
||||
"""
|
||||
Trains a logistic regression model.
|
||||
"""
|
||||
iris = self._read_csv(DATA_DIR)
|
||||
|
||||
# Get the serialized model as a bytes object
|
||||
model_bytes = model_buffer.getvalue()
|
||||
# Dropping the Species since we only need the measurements
|
||||
X = iris.drop(["Species"], axis=1)
|
||||
|
||||
# Define the path for the Unix domain socket
|
||||
socket_path = sys.argv[1]
|
||||
# converting into numpy array and assigning petal length and petal width
|
||||
X = X.to_numpy()[:, (3, 4)]
|
||||
y = iris["Species"]
|
||||
|
||||
# Create a Unix domain socket client
|
||||
client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
|
||||
X_train, _, y_train, _ = train_test_split(X, y, test_size=0.5, random_state=42)
|
||||
|
||||
try:
|
||||
# Connect to the server
|
||||
client.connect(socket_path)
|
||||
log_reg = LogisticRegression()
|
||||
log_reg.fit(X_train, y_train)
|
||||
self.model = log_reg
|
||||
|
||||
# Send the serialized model over the socket
|
||||
client.send(model_bytes)
|
||||
def save_result(self):
|
||||
"""
|
||||
Saves the trained model to a file.
|
||||
"""
|
||||
try:
|
||||
os.makedirs(RESULTS_DIR)
|
||||
except FileExistsError:
|
||||
pass
|
||||
|
||||
finally:
|
||||
# Close the socket
|
||||
client.close()
|
||||
results_file = RESULTS_DIR + os.sep + RESULTS_FILE
|
||||
joblib.dump(self.model, results_file)
|
||||

    def read_results_from_file(self, results_file):
        """
        Loads the trained model from a results file or a zipped results directory.
        """
        if results_file.endswith(".zip"):
            # Extract the archive, then load the model from the results directory.
            os.makedirs(RESULTS_DIR, exist_ok=True)
            with zipfile.ZipFile(results_file, "r") as zip_ref:
                zip_ref.extractall(RESULTS_DIR)
            self.model = joblib.load(os.path.join(RESULTS_DIR, RESULTS_FILE))
        else:
            self.model = joblib.load(results_file)

    def predict(self, data_path=""):
        """
        Evaluates the loaded model on the dataset found in data_path.
        """
        iris = self._read_csv(data_path)

        # Dropping the Species since we only need the measurements
        X = iris.drop(["Species"], axis=1)

        # converting into numpy array and assigning petal length and petal width
        X = X.to_numpy()[:, (3, 4)]
        y = iris["Species"]

        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=0.5, random_state=42
        )

        training_prediction = self.model.predict(X_train)
        test_prediction = self.model.predict(X_test)

        print("Precision, Recall, Confusion matrix, in training\n")
        print(metrics.classification_report(y_train, training_prediction, digits=3))
        print(metrics.confusion_matrix(y_train, training_prediction))
        print("Precision, Recall, Confusion matrix, in testing\n")
        print(metrics.classification_report(y_test, test_prediction, digits=3))
        print(metrics.confusion_matrix(y_test, test_prediction))


if __name__ == "__main__":
    computation = Computation()
    if len(sys.argv) == 1:
        # No arguments: train the model and save it to the results directory.
        computation.compute()
        computation.save_result()
    elif len(sys.argv) == 4 and sys.argv[1] == "predict":
        # predict <results file> <data directory>
        computation.read_results_from_file(sys.argv[2])
        computation.predict(sys.argv[3])
    else:
        print("Invalid arguments")
        sys.exit(1)
@@ -1,51 +0,0 @@
import pandas as pd

from sklearn.model_selection import train_test_split
from sklearn import metrics
import joblib

import sys

import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", category=UserWarning)

csv_file_path = sys.argv[1]
model_filename = sys.argv[2]

# Load the CSV file into a Pandas DataFrame
iris = pd.read_csv(csv_file_path)

log_reg = joblib.load(model_filename)

# Now you have the Iris dataset loaded into the iris_df DataFrame
print(iris.head()) # Display the first few rows of the DataFrame

# Droping the Species since we only need the measurements
X = iris.drop(['Species'], axis=1)

# converting into numpy array and assigning petal length and petal width
X = X.to_numpy()[:, (3,4)]
y = iris['Species']

# Splitting into train and test
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.5, random_state=42)

training_prediction = log_reg.predict(X_train)
test_prediction = log_reg.predict(X_test)

print("Precision, Recall, Confusion matrix, in training\n")

# Precision Recall scores
print(metrics.classification_report(y_train, training_prediction, digits=3))

# Confusion matrix
print(metrics.confusion_matrix(y_train, training_prediction))

print("Precision, Recall, Confusion matrix, in testing\n")

# Precision Recall scores
print(metrics.classification_report(y_test, test_prediction, digits=3))

# Confusion matrix
print(metrics.confusion_matrix(y_test, test_prediction))