NOISSUE - Enable WASM Support and FileSystem Support (#189)

* feat(algorithm): Add wasm as an algo type
  Signed-off-by: Rodney Osodo <socials@rodneyosodo.com>
* feat(algorithm): Use filesystem to store results
  Move from unix socket for results storage to filesystem
* test: test new filesystem changes
  Signed-off-by: Rodney Osodo <socials@rodneyosodo.com>
* refactor(files): rename resultFile to resultsFilePath
* feat(wasm-runtime): change from wasmtime to wasmedge
  Wasmedge enables easier directory mapping to get results
  Signed-off-by: Rodney Osodo <socials@rodneyosodo.com>
* feat(algorithm): send results as zipped directory
  Create a new function to zip the results directory and send it back to the user
* fix(wasm): runtime argument
  Fix the directory mapping for wasm runtime arguments
  Signed-off-by: Rodney Osodo <socials@rodneyosodo.com>
* fix(errors): provide useful error message
* chore(gitignore): add results zip to gitignore
* feat(filesystem): Enable storing results on filesystem for python algos
* refactor: revert to upstream cocos repo
  Signed-off-by: Rodney Osodo <socials@rodneyosodo.com>
* fix: remove AddDataset from algorithm interface
* fix: agent to handle results zipping
* test: test zipping directories
* refactor(agent): Handle file operations from agent
* test: run test inside eos
  Signed-off-by: Rodney Osodo <socials@rodneyosodo.com>
* refactor(test): Document and test algos are running
  Document steps on running the 2 python exampls and ensure they are running on eos
  Signed-off-by: Rodney Osodo <socials@rodneyosodo.com>
* fix: remove witheDataset option
* test: test without dataset argument
  Signed-off-by: Rodney Osodo <socials@rodneyosodo.com>

---------

Signed-off-by: Rodney Osodo <socials@rodneyosodo.com>
2026-06-23 04:10:25 +00:00 · 2024-08-06 20:06:48 +03:00
parent 3c855e3b68
commit afc306a85b
23 changed files with 519 additions and 267 deletions
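
The commit message describes algorithms writing their output to a results directory, which is then zipped and sent back to the user. The sketch below illustrates that idea in Python, under stated assumptions: the helper name `zip_directory` and all paths are hypothetical and are not taken from the agent's code.

```python
import os
import tempfile
import zipfile


def zip_directory(src_dir, zip_path):
    """Zip every file under src_dir, storing paths relative to src_dir.

    Hypothetical helper for illustration; not the agent's implementation.
    Returns the list of archived relative paths.
    """
    archived = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                full_path = os.path.join(root, name)
                rel_path = os.path.relpath(full_path, src_dir)
                zf.write(full_path, arcname=rel_path)
                archived.append(rel_path)
    return archived


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Simulate an algorithm that wrote one file into its results directory.
        results_dir = os.path.join(tmp, "results")
        os.makedirs(results_dir)
        with open(os.path.join(results_dir, "result.txt"), "w") as f:
            f.write("15")

        # Keep the archive outside the directory being zipped.
        archive = os.path.join(tmp, "result.zip")
        zip_directory(results_dir, archive)

        # Read the archive back to show what a recipient would see.
        with zipfile.ZipFile(archive) as zf:
            print(zf.namelist())
```

Storing paths relative to the results directory means the archive extracts to the same layout wherever it is unpacked, which matches how the example scripts in this change extract a `.zip` into a local `results` directory before reading it.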
@@ -1,17 +1,105 @@
 # Algorithm

-Agent accepts binaries programs. To use the python program you need to bundle or compile it.
-In this example we'll use [pyinstaller](https://pypi.org/project/pyinstaller/)
+The agent accepts binary programs, Python scripts, and Wasm files. It runs them in a sandboxed environment and returns the output.

-```shell
-pip install pandas scikit-learn
-pip install -U pyinstaller
-pyinstaller --onefile lin_reg.py
+## Python Example
+
+To check that these examples work on your local machine, install the following dependencies:
+
+```bash
+pip install -r requirements.txt
 ```

-Make the binary static:
+This can be done in a virtual environment.

-```shell
-pip install staticx
-staticx <dynamic_binary_file_path> <output_file_path> 
+```bash
+python -m venv venv
+source venv/bin/activate
+pip install -r requirements.txt
 ```
+
+To run the example, you can use the following command:
+
+```bash
+python3 test/manual/algo/addition.py
+```
+
+The addition example is a simple algorithm that demonstrates running an algorithm with no external dependencies and no input arguments. It computes the sum of two numbers and writes it to `results/result.txt`.
+
+```bash
+python3 test/manual/algo/lin_reg.py
+```
+
+The linear regression example is a more complex algorithm that requires external dependencies. Despite its name, it trains a scikit-learn `LogisticRegression` model on the iris dataset found [here](../data/) for demonstration purposes.
+
+```bash
+python3 test/manual/algo/lin_reg.py predict result.zip test/manual/data
+```
+
+This runs inference with the trained model and prints a classification report and a confusion matrix for both the training and test splits.
+
+To run the examples in the agent, you can use the following command:
+
+```bash
+go run ./test/computations/main.go ./test/manual/algo/lin_reg.py public.pem false ./test/manual/data/iris.csv
+```
+
+Run this command from the root directory of the project. It starts the computation server.
+
+In another window, you can run the following command:
+
+```bash
+sudo MANAGER_QEMU_SMP_MAXCPUS=4 MANAGER_GRPC_URL=localhost:7001 MANAGER_LOG_LEVEL=debug MANAGER_QEMU_USE_SUDO=false MANAGER_QEMU_ENABLE_SEV=false MANAGER_QEMU_SEV_CBITPOS=51 MANAGER_QEMU_ENABLE_SEV_SNP=false MANAGER_QEMU_OVMF_CODE_FILE=/usr/share/edk2/x64/OVMF_CODE.fd MANAGER_QEMU_OVMF_VARS_FILE=/usr/share/edk2/x64/OVMF_VARS.fd go run main.go
+```
+
+Run this command from the [manager main directory](../../../cmd/manager/). It starts the manager. Make sure you have already built the [QEMU image](../../../hal/linux/README.md).
+
+In another window, you can run the following command:
+
+```bash
+./build/cocos-cli algo ./test/manual/algo/lin_reg.py ./private.pem -a python -r ./test/manual/algo/requirements.txt
+```
+
+Make sure you have built `cocos-cli` first. This command uploads the algorithm and the requirements file.
+
+Next, upload the dataset:
+
+```bash
+./build/cocos-cli data ./test/manual/data/iris.csv ./private.pem
+```
+
+After some time, when the results are ready, run the following command to retrieve them:
+
+```bash
+./build/cocos-cli results ./private.pem
+```
+
+This will return the results of the algorithm.
+
+To run inference on the results, use the following command:
+
+```bash
+python3 test/manual/algo/lin_reg.py predict result.zip test/manual/data
+```
+
+For the addition example, use the following commands:
+
+```bash
+go run ./test/computations/main.go ./test/manual/algo/addition.py public.pem false
+```
+
+```bash
+./build/cocos-cli algo ./test/manual/algo/addition.py ./private.pem -a python
+```
+
+```bash
+./build/cocos-cli results ./private.pem
+```
+
+## Wasm Example
+
+More information on how to run wasm files can be found [here](https://github.com/ultravioletrs/ai/tree/main/burn-algorithms).
+
+## Binary Example
+
+More information on how to run binary files can be found [here](https://github.com/ultravioletrs/ai/tree/main/burn-algorithms).
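
Both Python examples in this change follow the same filesystem convention: output is written into a `results` directory, and `lin_reg.py` reads its single input file from a `datasets` directory. A minimal sketch of another algorithm following that convention is shown below. The `count_rows` function and the `row_count.txt` file name are hypothetical, chosen only for illustration, and the directory names are assumed to be relative to the working directory as in the examples.

```python
import csv
import os

DATA_DIR = "datasets"           # input directory name used by lin_reg.py
RESULTS_DIR = "results"         # output directory name used by both examples
RESULTS_FILE = "row_count.txt"  # hypothetical output file name


def count_rows(data_dir=DATA_DIR):
    """Count data rows (excluding the header) in the single CSV in data_dir."""
    files = sorted(os.listdir(data_dir))
    if len(files) != 1:
        raise RuntimeError(
            f"expected exactly one file in {data_dir!r}, found {len(files)}"
        )
    with open(os.path.join(data_dir, files[0]), newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row
        return sum(1 for _ in reader)


def save_result(value, results_dir=RESULTS_DIR):
    """Write the result into the results directory and return its path."""
    os.makedirs(results_dir, exist_ok=True)
    path = os.path.join(results_dir, RESULTS_FILE)
    with open(path, "w") as f:
        f.write(str(value))
    return path


if __name__ == "__main__":
    if os.path.isdir(DATA_DIR):
        print(save_result(count_rows()))
    else:
        print(f"no {DATA_DIR!r} directory found; nothing to do")
```

Raising an error when the input directory does not contain exactly one file mirrors the check in `lin_reg.py`, which exits in that case.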
@@ -1,9 +1,14 @@
-import sys, io
-import joblib
-import socket
+import os
+import sys
+import zipfile
+
+RESULTS_DIR = "results"
+RESULTS_FILE = "result.txt"
+

 class Computation:
    result = 0
+
    def __init__(self):
        """
        Initializes a new instance of the Computation class.
@@ -16,45 +21,35 @@ class Computation:
        """
        self.result = a + b

-    def send_result(self, socket_path):
+    def save_result(self):
        """
-        Sends the result to a socket.
+        Saves the result to a file.
        """
-        buffer = io.BytesIO()
-        
        try:
-            joblib.dump(self.result, buffer)
-        except Exception as e:
-            print("Failed to dump the result to the buffer: ", e)
-            return
+            os.makedirs(RESULTS_DIR)
+        except FileExistsError:
+            pass

-        data = buffer.getvalue()
+        with open(RESULTS_DIR + os.sep + RESULTS_FILE, "w") as f:
+            f.write(str(self.result))

-        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
-        try:
-            try:
-                client.connect(socket_path)
-            except Exception as e:
-                print("Failed to connect to the socket: ", e)
-                return
-            try:
-                client.send(data)
-            except Exception as e:
-                print("Failed to send data to the socket: ", e)
-                return
-        finally:
-            client.close()
-    
    def read_results_from_file(self, results_file):
        """
        Reads the results from a file.
        """
-        try:
-            results = joblib.load(results_file)
-            print("Results: ", results)
-        except Exception as e:
-            print("Failed to load results from file: ", e)
-            return
+        if results_file.endswith(".zip"):
+            try:
+                os.makedirs(RESULTS_DIR)
+            except FileExistsError:
+                pass
+            with zipfile.ZipFile(results_file, "r") as zip_ref:
+                zip_ref.extractall(RESULTS_DIR)
+            with open(RESULTS_DIR + os.sep + RESULTS_FILE, "r") as f:
+                print(f.read())
+        else:
+            with open(results_file, "r") as f:
+                print(f.read())
+

 if __name__ == "__main__":
    a = 5
@@ -62,15 +57,10 @@ if __name__ == "__main__":
    computation = Computation()

    if len(sys.argv) == 1:
-        print("Please provide a socket path or a file path")
-        exit(1)
-    
-    if sys.argv[1] == "test" and len(sys.argv) == 3:
-        computation.read_results_from_file(sys.argv[2])
-    elif len(sys.argv) == 2:
        computation.compute(a, b)
-        computation.send_result(sys.argv[1])
+        computation.save_result()
+    elif len(sys.argv) == 3 and sys.argv[1] == "test":
+        computation.read_results_from_file(sys.argv[2])
    else:
        print("Invalid arguments")
        exit(1)
-
@@ -1,47 +1,116 @@
-import sys, io
+import os
+import sys
 import joblib
-import socket
-
 import pandas as pd
 from sklearn.model_selection import train_test_split
 from sklearn.linear_model import LogisticRegression
+import zipfile
+from sklearn import metrics

-csv_file_path = sys.argv[2]
-iris = pd.read_csv(csv_file_path)
+DATA_DIR = "datasets"
+RESULTS_DIR = "results"
+RESULTS_FILE = "model.bin"

-# Droping the Species since we only need the measurements
-X = iris.drop(['Species'], axis=1)

-# converting into numpy array and assigning petal length and petal width
-X = X.to_numpy()[:, (3,4)]
-y = iris['Species']
+class Computation:
+    model = None

-# Splitting into train and test
-X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.5, random_state=42)
+    def __init__(self):
+        """
+        Initializes a new instance of the Computation class.
+        """
+        pass

-log_reg = LogisticRegression()
-log_reg.fit(X_train,y_train)
+    def _read_csv(self, data_path=""):
+        """
+        Reads the CSV file.
+        """
+        files = os.listdir(data_path)
+        if len(files) != 1:
+            print("Expected exactly one file in the data directory")
+            exit(1)
+        csv_file_path = data_path + os.sep + files[0]
+        return pd.read_csv(csv_file_path)

-# Serialize the trained model to a byte buffer
-model_buffer = io.BytesIO()
-joblib.dump(log_reg, model_buffer)
+    def compute(self):
+        """
+        Trains a logistic regression model.
+        """
+        iris = self._read_csv(DATA_DIR)

-# Get the serialized model as a bytes object
-model_bytes = model_buffer.getvalue()
+        # Dropping the Species since we only need the measurements
+        X = iris.drop(["Species"], axis=1)

-# Define the path for the Unix domain socket
-socket_path = sys.argv[1]
+        # converting into numpy array and assigning petal length and petal width
+        X = X.to_numpy()[:, (3, 4)]
+        y = iris["Species"]

-# Create a Unix domain socket client
-client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
+        X_train, _, y_train, _ = train_test_split(X, y, test_size=0.5, random_state=42)

-try:
-    # Connect to the server
-    client.connect(socket_path)
+        log_reg = LogisticRegression()
+        log_reg.fit(X_train, y_train)
+        self.model = log_reg

-    # Send the serialized model over the socket
-    client.send(model_bytes)
+    def save_result(self):
+        """
+        Saves the result to a file.
+        """
+        try:
+            os.makedirs(RESULTS_DIR)
+        except FileExistsError:
+            pass

-finally:
-    # Close the socket
-    client.close()
+        results_file = RESULTS_DIR + os.sep + RESULTS_FILE
+        joblib.dump(self.model, results_file)
+
+    def read_results_from_file(self, results_file):
+        """
+        Reads the results from a file.
+        """
+        if results_file.endswith(".zip"):
+            try:
+                os.makedirs(RESULTS_DIR)
+            except FileExistsError:
+                pass
+            with zipfile.ZipFile(results_file, "r") as zip_ref:
+                zip_ref.extractall(RESULTS_DIR)
+            self.model = joblib.load(RESULTS_DIR + os.sep + RESULTS_FILE)
+        else:
+            self.model = joblib.load(results_file)
+
+    def predict(self, data_path=""):
+        iris = self._read_csv(data_path)
+
+        # Dropping the Species since we only need the measurements
+        X = iris.drop(["Species"], axis=1)
+
+        # converting into numpy array and assigning petal length and petal width
+        X = X.to_numpy()[:, (3, 4)]
+        y = iris["Species"]
+
+        X_train, X_test, y_train, y_test = train_test_split(
+            X, y, test_size=0.5, random_state=42
+        )
+
+        training_prediction = self.model.predict(X_train)
+        test_prediction = self.model.predict(X_test)
+
+        print("Precision, Recall, Confusion matrix, in training\n")
+        print(metrics.classification_report(y_train, training_prediction, digits=3))
+        print(metrics.confusion_matrix(y_train, training_prediction))
+        print("Precision, Recall, Confusion matrix, in testing\n")
+        print(metrics.classification_report(y_test, test_prediction, digits=3))
+        print(metrics.confusion_matrix(y_test, test_prediction))
+
+
+if __name__ == "__main__":
+    computation = Computation()
+    if len(sys.argv) == 1:
+        computation.compute()
+        computation.save_result()
+    elif len(sys.argv) == 4 and sys.argv[1] == "predict":
+        computation.read_results_from_file(sys.argv[2])
+        computation.predict(sys.argv[3])
+    else:
+        print("Invalid arguments")
+        exit(1)
@@ -1,51 +0,0 @@
-import pandas as pd
-
-from sklearn.model_selection import train_test_split
-from sklearn import metrics
-import joblib
-
-import sys
-
-import warnings
-warnings.filterwarnings("ignore", category=DeprecationWarning)
-warnings.filterwarnings("ignore", category=UserWarning)
-
-csv_file_path = sys.argv[1]
-model_filename = sys.argv[2]
-
-# Load the CSV file into a Pandas DataFrame
-iris = pd.read_csv(csv_file_path)
-
-log_reg = joblib.load(model_filename)
-
-# Now you have the Iris dataset loaded into the iris_df DataFrame
-print(iris.head())  # Display the first few rows of the DataFrame
-
-# Droping the Species since we only need the measurements
-X = iris.drop(['Species'], axis=1)
-
-# converting into numpy array and assigning petal length and petal width
-X = X.to_numpy()[:, (3,4)]
-y = iris['Species']
-
-# Splitting into train and test
-X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.5, random_state=42)
-
-training_prediction = log_reg.predict(X_train)
-test_prediction = log_reg.predict(X_test)
-
-print("Precision, Recall, Confusion matrix, in training\n")
-
-# Precision Recall scores
-print(metrics.classification_report(y_train, training_prediction, digits=3))
-
-# Confusion matrix
-print(metrics.confusion_matrix(y_train, training_prediction))
-
-print("Precision, Recall, Confusion matrix, in testing\n")
-
-# Precision Recall scores
-print(metrics.classification_report(y_test, test_prediction, digits=3))
-
-# Confusion matrix
-print(metrics.confusion_matrix(y_test, test_prediction))