mirror of
https://github.com/ultravioletrs/cocos.git
synced 2026-06-23 04:10:25 +00:00
main
102 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
6169766666 |
NOISSUE - Fix agent startup issues (#605)
CI / lint (push) Has been cancelled
CI / test (agent) (push) Has been cancelled
CI / test (cli) (push) Has been cancelled
CI / test (cmd) (push) Has been cancelled
CI / test (internal) (push) Has been cancelled
CI / test (manager, true) (push) Has been cancelled
CI / test (pkg) (push) Has been cancelled
CI / upload-coverage (push) Has been cancelled
* Update attestationFromCert function to include ccPlatform parameter for enhanced attestation processing Signed-off-by: Sammy Oina <sammyoina@gmail.com> * chore: migrate dependencies from supermq to magistrala and update build configurations Signed-off-by: Sammy Oina <sammyoina@gmail.com> * chore: update project dependencies, repository source, and support TDX QuoteV5 attestation Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
42b05524c8 |
NOISSUE - Implement structured logging with log forwarding for ingress-proxy and computation-runner, update component versions, and improve aTLS initialization and error handling. (#583)
CI / lint (push) Has been cancelled
CI / test (agent) (push) Has been cancelled
CI / test (cli) (push) Has been cancelled
CI / test (cmd) (push) Has been cancelled
CI / test (internal) (push) Has been cancelled
CI / test (manager, true) (push) Has been cancelled
CI / test (pkg) (push) Has been cancelled
CI / upload-coverage (push) Has been cancelled
* feat: Implement structured logging with log forwarding for `ingress-proxy` and `computation-runner`, update component versions, and improve aTLS initialization and error handling. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * refactor: Remove explicit AGENT_ENABLE_ATLS configuration and update component versions. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix: Correct aTLS nonce verification for truncated hashes, delegate internal CVM server TLS to Ingress Proxy, and update component versions. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * chore: Update package build sources to ultravioletrs/cocos main branch and remove local development keys and encrypted algorithm. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Remove the `pkg/server` module, including its generic gRPC and HTTP server implementations. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * chore: clarify nonce truncation in the certificate verifier. Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
c1cbcec851 |
COCOS-577 - Introduce Go-based CoRIM generation and deprecate Rust attestation policy scripts. (#578)
CI / lint (push) Has been cancelled
CI / test (agent) (push) Has been cancelled
CI / test (cli) (push) Has been cancelled
CI / test (cmd) (push) Has been cancelled
CI / test (internal) (push) Has been cancelled
CI / test (manager, true) (push) Has been cancelled
CI / test (pkg) (push) Has been cancelled
CI / upload-coverage (push) Has been cancelled
* feat: Introduce Go-based CoRIM generation and deprecate Rust attestation policy scripts. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * feat: Update dependencies and refactor attestation policy handling Signed-off-by: Sammy Oina <sammyoina@gmail.com> * refactor: Migrate attestation verification to use CoRIM and remove deprecated policy handling and EAT verification tests. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Removed the `tdx` and `sev-snp` attestation policy scripts and their build configurations, along with related build and installation steps from the main Makefile. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * chore: Remove Rust CI workflow and Cargo Dependabot configuration, and enhance Go test setup for attestation policy paths. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * refactor: Use WriteString instead of Write([]byte) for writing policy file content in test. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * feat: Refactor `ca-bundle` command to fetch bundles by product string using a configurable HTTP getter with improved error handling, and simplify `attestation_policy` command usage. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix: ignore return value of cmd.Help() Signed-off-by: Sammy Oina <sammyoina@gmail.com> * feat: Implement CoRIM generation for Azure and GCP attestation policies and add a CLI command to download and verify GCP OVMF files. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * feat: Upgrade Python virtual environment setup to include setuptools and wheel, append computation ID to Docker container names, and improve test robustness with error assertions and conditional skips for runtime tests. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * test: Enhance attestation verification tests, including CoRIM integration and specific platform types like Azure SNP, vTPM, TDX, and IGVM. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * feat: Add comprehensive test cases for `VerifyWithCoRIM` including success and measurement mismatch, and refine reference value validation. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * feat: Add Azure and TDX attestation verification tests and abstract external service dependencies for improved testability. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * feat: Add new test cases for Azure measurement extraction, EAT platform types, IGVM measurement stopping, vTPM CoRIM verification, and GCP OVMF download CLI. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * test: enhance CLI CoRIM generation and ATLS certificate verification tests, and refactor the Azure MAA client to use an interface. Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
0ffc2d17cf |
NOISSUE - Update cocos to match certs changes (#520)
CI / checkproto (push) Has been cancelled
CI / lint (push) Has been cancelled
CI / test (agent) (push) Has been cancelled
CI / test (cli) (push) Has been cancelled
CI / test (cmd) (push) Has been cancelled
CI / test (internal) (push) Has been cancelled
CI / test (manager, true) (push) Has been cancelled
CI / test (pkg) (push) Has been cancelled
CI / upload-coverage (push) Has been cancelled
* pass domain id to agent environment Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * update generated files Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * use certs sdk directly Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * remove redundant variables Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * use agent certs token for csr Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * update certs and add token to create req Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * fix atls Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * add agent token to certificate provider Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * pass certs token to agent Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * use sdk for csr Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * update atls Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * fix tests Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * address comments Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * remove unused structs Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * update tests Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * lint Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * fix tests Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * lint Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * remove unused domain id Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * refactor tests and remove unused struct fields Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * refactor(atls): remove CAClient and inline CA certificate issuance Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * lint' Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * increase coverage Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * fix bug in certs sdk and certificate provider Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * update certs Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * fix pkg stress Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> --------- Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> |
||
|
|
c758b3b216 |
NOISSUE - Refactor aTLS and gRPC server to use CertificateProvider interface (#522)
* Refactor ATLS and gRPC server to use CertificateProvider interface - Removed unused test cases and mock dependencies in atls_test.go. - Updated TestGetPlatformVerifier to use CertificateVerifier struct. - Introduced CertificateProvider interface for better abstraction in TLS handling. - Refactored gRPC server to accept CertificateProvider and configure TLS accordingly. - Simplified TLS configuration logic in both gRPC and HTTP servers. - Removed unnecessary parameters from server initialization in tests and main function. - Enhanced logging for TLS configurations. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Fix comments for consistency and clarity in atls.go Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Update expected error messages in VM command tests for clarity Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Enhance tests by integrating mock providers and improving error messages for clarity Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Add comprehensive tests for certificate generation and attestation providers Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Implement certificate and attestation providers with unified generation logic Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Refactor certificate and attestation provider structures for consistency; implement CertificateVerifier interface and related methods Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Refactor attestation and certificate provider methods for consistency; rename methods and update related logic Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
906d7877b2 |
NOISSUE - Refactor http and grpc clients for reusability with Cube (#521)
* Implement gRPC server with TLS and mTLS support - Added gRPC server implementation in pkg/server/grpc. - Introduced server configuration options for TLS and mTLS. - Implemented health check service for gRPC. - Created tests for server initialization, startup, and shutdown scenarios. - Added mock server for testing purposes. - Implemented graceful shutdown handling for the server. - Included documentation for the server package. Signed-off-by: SammyOina <sammyoina@gmail.com> * Add TLS and ATLS support to gRPC and HTTP clients; refactor security handling Signed-off-by: SammyOina <sammyoina@gmail.com> * Refactor server configuration structure to use Config instead of BaseConfig Signed-off-by: SammyOina <sammyoina@gmail.com> * Fix comments for consistency and clarity in TLS-related code Signed-off-by: SammyOina <sammyoina@gmail.com> * Add comprehensive tests for TLS and ATLS configurations in clients package Signed-off-by: SammyOina <sammyoina@gmail.com> * Refactor file permission constants in client tests to use octal notation Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Add tests for HTTP server's TLS configuration and lifecycle management Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Add comprehensive tests for TLS certificate handling and configuration Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Add comprehensive tests for HTTP client configuration and transport Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Refactor AttestationReportSize constant declaration for clarity Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Refactor client configuration structure and update gRPC client implementations - Consolidated client configuration types into a unified structure with BaseConfig. - Introduced AttestedClientConfig and StandardClientConfig for specific use cases. - Updated gRPC client creation functions to utilize new configuration types. - Refactored tests to align with the new configuration structure. - Removed redundant ClientConfiguration interface and related methods. - Simplified TLS configuration loading logic for both standard and attested clients. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Refactor client configuration structure and TLS handling - Introduced StandardClientConfig to replace BaseConfig, simplifying client configuration. - Updated AttestedClientConfig to embed StandardClientConfig instead of BaseConfig. - Modified ClientConfiguration interface to use Config() method instead of GetBaseConfig(). - Refactored various client tests to accommodate changes in configuration structure. - Added new TLS handling functions to support basic and attested TLS configurations. - Implemented comprehensive tests for TLS loading and configuration validation. - Removed deprecated methods and unnecessary code related to BaseConfig. Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: SammyOina <sammyoina@gmail.com> Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
8eb1fac9ad |
NOISSUE - Refactor and update dependencies in the project (#491)
* Refactor and update dependencies in the project - Updated go.sum to replace `github.com/absmach/magistrala` with `github.com/absmach/supermq` across various modules. - Removed VSock configuration from environment variables and QEMU arguments. - Updated QEMU configuration and related tests to remove references to guest CID and VSock. - Added new HTTP transport layer for API endpoints in the manager. - Introduced Prometheus monitoring configuration with alert rules and Alertmanager setup. - Updated service and VM interfaces to remove unused methods and references. - Refactored tests to align with the new structure and dependencies. Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Add MaxVMs configuration and enforce limit on VM creation Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Add comprehensive tests for HTTP transport handlers and endpoints Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Add test case for exceeding maximum number of VMs in TestRun Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Improve error handling in TestHandlerWithCustomRouter to ensure response writing is checked Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Update dependencies to latest versions - Upgrade cel.dev/expr from v0.23.0 to v0.24.0 - Upgrade github.com/absmach/supermq from v0.16.0 to v0.17.0 - Upgrade github.com/cenkalti/backoff from v4.3.0 to v5.0.2 - Upgrade github.com/cncf/xds/go to v0.0.0-20250501225837-2ac532fd4443 - Upgrade github.com/go-chi/chi/v5 from v5.2.1 to v5.2.2 - Upgrade github.com/go-jose/go-jose/v3 from v3.0.3 to v3.0.4 - Upgrade github.com/gofrs/uuid/v5 from v5.3.0 to v5.3.2 - Upgrade github.com/prometheus/client_golang from v1.22.0 to v1.23.0 - Upgrade github.com/prometheus/client_model from v0.6.1 to v0.6.2 - Upgrade github.com/prometheus/common from v0.62.0 to v0.65.0 - Upgrade github.com/prometheus/procfs from v0.15.1 to v0.16.1 - Upgrade go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp from v0.60.0 to v0.62.0 - Upgrade go.opentelemetry.io/otel/exporters/otlp/otlptrace from v1.36.0 to v1.37.0 - Upgrade golang.org/x/crypto from v0.39.0 to v0.40.0 - Upgrade golang.org/x/sys from v0.33.0 to v0.34.0 - Upgrade golang.org/x/text from v0.26.0 to v0.27.0 - Upgrade golang.org/x/time from v0.11.0 to v0.12.0 - Upgrade google.golang.org/grpc from v1.73.0 to v1.74.2 Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
45187d7f41 |
COCOS-454 - Implement graceful shutdown for services and add TTL management for VMs (#473)
* Implement graceful shutdown for services and add TTL management for VMs Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Remove unnecessary comment from go-tdx-guest dependency in go.mod Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Update manager/api/logging.go Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Add TTL manager initialization in TestStop Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Fix logging format in Shutdown method for consistency Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Add unit tests for TTL manager functionality Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Enhance TTL tests with mutex for thread safety in expiration checks Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Add TTL parameter to CreateVM in TestRun for improved testing scenarios Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Add Shutdown test to verify VM cleanup and TTL manager integration Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: Sammy Oina <sammyoina@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> |
||
|
|
f543cb4363 |
COCOS-456 Remove SEV support from repo (#472)
* Remove SEV support from repo * Remove SEV references |
||
|
|
77325753f8 |
NOISSUE - Add TDX support to Manager (#446)
CI / ci (push) Has been cancelled
* Add TDX support on manager * Add functions to check platform * Search for tdx in kernel parameters * Modify based on comments |
||
|
|
5960b06126 |
NOISSUE - Include context parameter in New and main functions (#452)
CI / ci (push) Has been cancelled
Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
3102114ff3 |
COCOS-395 - Cloud Provider Firmware Integration (#415)
* add CC platform identification capability * add token verification * add snp azure * add azure snp report verification * fix linter errors * fix agent tests * expand the CC provider * fix azure atls * rebase branch * add nonce check for azure token * rename package attestations * remove alias attestations --------- Co-authored-by: Ubuntu <azureuser@UVCTestCVM.bu0p0zdolasezg1jifpyqhaxuc.dx.internal.cloudapp.net> |
||
|
|
ec306c72b9 |
COCOS-397 - Agent certificate generation via CA service (#410)
* Initial commit, will be tested before creating a PR * Initial commit, will be tested before creating a PR * Fixed all issues * Initial commit, will be tested before creating a PR * Updated agent docs * Fixed based on comments * Fixed based on comments * Initial commit, will be tested before creating a PR * Updated agent docs * Fixed based on comments * Fixed based on comments * added certificate verification * Initial commit, will be tested before creating a PR * Fixed all issues * Initial commit, will be tested before creating a PR * Initial commit, will be tested before creating a PR * Updated agent docs * Fixed based on comments * Fixed based on comments * added certificate verification * Fixed rebase errors * Fixed proto issues * fixed proto issues * Fixed format error * Fixed based on comments * NOISSUE - Simplify local agent running in non sev-snp environment (#411) * Add vtpm attestation support to agent service and server Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Update mockery version to v2.53.2 and refactor VM factory to include logger Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Send event notification when computation is stopped in agentService Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Remove redundant assignment of Stderr in qemuVM Start method Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Rename SVM references to CVM in tracing, logging, metrics, and service layers Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Bump github.com/docker/docker (#416) Bumps [github.com/docker/docker](https://github.com/docker/docker) from 28.0.1+incompatible to 28.0.4+incompatible. - [Release notes](https://github.com/docker/docker/releases) - [Commits](https://github.com/docker/docker/compare/v28.0.1...v28.0.4) --- updated-dependencies: - dependency-name: github.com/docker/docker dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump google.golang.org/protobuf from 1.36.5 to 1.36.6 (#412) Bumps google.golang.org/protobuf from 1.36.5 to 1.36.6. --- updated-dependencies: - dependency-name: google.golang.org/protobuf dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * COCOS-393 - Disable SSH service and update user shell in cloud config (#396) * Disable SSH service and update user shell in cloud config Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Remove SSH server and clean up dependencies in cloud config Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Add firewall configuration and ensure iptables rules persist after reboot Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Add algo_user configuration and setup script for container execution Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: Sammy Oina <sammyoina@gmail.com> * Initial commit, will be tested before creating a PR * Fixed all issues * Initial commit, will be tested before creating a PR * Initial commit, will be tested before creating a PR * Fixed based on comments * Fixed based on comments * added certificate verification * Initial commit, will be tested before creating a PR * Fixed all issues * Initial commit, will be tested before creating a PR * Initial commit, will be tested before creating a PR * Fixed based on comments * Fixed rebase errors * Fixed format error * Fixed based on comments * Fixed rebase errors --------- Signed-off-by: Sammy Oina <sammyoina@gmail.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: Sammy Kerata Oina <44265300+SammyOina@users.noreply.github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> |
||
|
|
293c65a3aa |
NOISSUE - Add path to expected PCR values (#398)
* add path to expected PCR values * change rust scrtip for attestation policy to print policy to stdout * fix cli test * remove stdout from cmd config struct * fix manager test * fix manager readme |
||
|
|
1f32f516b0 |
NOISSUE - Simplify manager to vm provision only (#353)
* new agent structure Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix lint Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix tests Signed-off-by: Sammy Oina <sammyoina@gmail.com> * cvm tests fix Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix test Signed-off-by: Sammy Oina <sammyoina@gmail.com> * manager server, for vm provisioning Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix lint Signed-off-by: Sammy Oina <sammyoina@gmail.com> * add cli and test Signed-off-by: Sammy Oina <sammyoina@gmail.com> * restore result cli Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix tests Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix failing tests Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix failing test Signed-off-by: Sammy Oina <sammyoina@gmail.com> * refactor: remove context from docker struct and use local context in Run method Signed-off-by: Sammy Oina <sammyoina@gmail.com> * delete: remove unused gRPC API and related server implementation Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
ecad6514f3 |
COCOS-344 - New agent structure (#350)
* new agent structure Signed-off-by: Sammy Oina <sammyoina@gmail.com> * minor fixes and testing Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix lint Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix tests Signed-off-by: Sammy Oina <sammyoina@gmail.com> * cvm tests fix Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix test Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix cli test Signed-off-by: Sammy Oina <sammyoina@gmail.com> * rename Signed-off-by: Sammy Oina <sammyoina@gmail.com> * rename cvm to cvms plural Signed-off-by: Sammy Oina <sammyoina@gmail.com> * rename service Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix tests Signed-off-by: Sammy Oina <sammyoina@gmail.com> * remove context Signed-off-by: Sammy Oina <sammyoina@gmail.com> * refactor: reorder parameters in NewAlgorithm functions and update CVMClient to CVMSClient Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix(tests): update SendEvent mock to include an additional parameter Signed-off-by: Sammy Oina <sammyoina@gmail.com> * move expectations Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix(tests): move event initialization to the correct scope in service tests Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix(tests): update SendEvent mock to use EXPECT instead of On in service tests Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
ec426e58a2 |
COCOS - 209 - Restructure agent and manager gRPC config (#297)
* restructure grpc configs Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> enhance clients Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> restructure config Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> refactor Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> rebase Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> rebase Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> use separate configuration Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> fix tests Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> fix config Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> refactor Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> Lint Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> fix tests Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> add tests Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> add test case Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> add test case Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> refactor Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> further refactor' Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> add tests Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> rebase Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * remove redundant code Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * fix test Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> --------- Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> |
||
|
|
760c9bb580 |
NOISSUE - Rename backend info to attestation policy (#314)
* attestation policy field Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix tests Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fmt Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
d5941edb56 |
NOISSUE - Add information on OVMF verison, CPU type, CPU number and EOS version (#307)
* add manager OVMF verison information * add Manager documentation * add kernel cmd * add SVMInfo test |
||
|
|
d3636de824 |
NOISSUE - Remove blocking on vsock (#301)
* no blocking Signed-off-by: Sammy Oina <sammyoina@gmail.com> * return error Signed-off-by: Sammy Oina <sammyoina@gmail.com> * add test cases Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
fad3182638 |
NOISSUE - Refactor manager events and detangle service (#287)
* extract events service Signed-off-by: Sammy Oina <sammyoina@gmail.com> * major refactor and detangling Signed-off-by: Sammy Oina <sammyoina@gmail.com> * small fixes Signed-off-by: Sammy Oina <sammyoina@gmail.com> * handle tests better Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix lint Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix race condition Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix race Signed-off-by: Sammy Oina <sammyoina@gmail.com> * use plain interface Signed-off-by: Sammy Oina <sammyoina@gmail.com> * move mutex Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
18aa8ba785 |
NOISSUE - Add internal tests (#266)
* add internal tests Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix linter Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix race conditions Signed-off-by: Sammy Oina <sammyoina@gmail.com> * remove all races Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
46d24f928a |
NOISSUE - Add events for disconnection of agent (#233)
* add events Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix lint Signed-off-by: Sammy Oina <sammyoina@gmail.com> * typo Signed-off-by: Sammy Oina <sammyoina@gmail.com> * group logs Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix error Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix initialization of goroutine Signed-off-by: Sammy Oina <sammyoina@gmail.com> * add comment Signed-off-by: SammyOina <sammyoina@gmail.com> * update comment Signed-off-by: SammyOina <sammyoina@gmail.com> * fix lint Signed-off-by: SammyOina <sammyoina@gmail.com> * remove naked return Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: Sammy Oina <sammyoina@gmail.com> Signed-off-by: SammyOina <sammyoina@gmail.com> |
||
|
|
7ba34b93bc |
NOISSUE - Streamline message processing to prevent potential message loss (#228)
* fix dropping of message response from manager Signed-off-by: Sammy Oina <sammyoina@gmail.com> * remove change Signed-off-by: Sammy Oina <sammyoina@gmail.com> * simplify Signed-off-by: Sammy Oina <sammyoina@gmail.com> * add message send timeout Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
9ca045b06a |
COCOS-214 - Improve manager resiliance by tracking vms on restart (#219)
* track hanging vm processes Signed-off-by: SammyOina <sammyoina@gmail.com> * fix lint Signed-off-by: SammyOina <sammyoina@gmail.com> * fix run test Signed-off-by: SammyOina <sammyoina@gmail.com> * fix stop computation Signed-off-by: SammyOina <sammyoina@gmail.com> * shutdown gracefully Signed-off-by: SammyOina <sammyoina@gmail.com> * check if process still exists Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix lint Signed-off-by: Sammy Oina <sammyoina@gmail.com> * use const Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: SammyOina <sammyoina@gmail.com> Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
e572793295 |
exit with error code (#225)
Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
5383f4465b |
NOISSUE - Exit on network failures only (#227)
Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
f906593492 |
remove tmp directory (#204)
Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
971e00aa68 |
NOISSUE - Select from port range (#196)
* select from port range Signed-off-by: Sammy Oina <sammyoina@gmail.com> * fix failing test Signed-off-by: Sammy Oina <sammyoina@gmail.com> * remove whitespace Signed-off-by: Sammy Oina <sammyoina@gmail.com> --------- Signed-off-by: Sammy Oina <sammyoina@gmail.com> |
||
|
|
9161d30683 |
PRISM-312 : Fetch Backend Information (#187)
* fetch backend info Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> WIP Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * add id to grpc response Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * read backend information Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> revert changes in test server Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * update info json Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * test on dell machine Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * update protoc Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> update protoc Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * refactor fetch backend info Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * remove computation definition Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * refactor manager service creation Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * refactor manager service creation: Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * return config to main Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * add tests on test/computation Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * update backend info path Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * use sudo Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * comment out sev testing section Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * update backend info json location Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * handle failed execution Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> * return error on failed execution: Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> --------- Signed-off-by: WashingtonKK <washingtonkigan@gmail.com> |
||
|
|
88707c4668 | NOISSUE - Fix Jaeger tracer (#177) | ||
|
|
dc16e8a997 |
NOISSUE - Use Magistrala Shared Packages (#174)
* refactor(env): remove internal env package No need for this package since the parent package github.com/caarlos0/env does everything we need Signed-off-by: Rodney Osodo <socials@rodneyosodo.com> * refactor(jaeger): remove internal jaeger package No need for this package since we can use magistrala exported package Signed-off-by: Rodney Osodo <socials@rodneyosodo.com> * refactor(metrics): remove internal metrics pkg Use exported magistrala prometheus package Signed-off-by: Rodney Osodo <socials@rodneyosodo.com> * chore(dep): Update grpc and other dependencies Signed-off-by: Rodney Osodo <socials@rodneyosodo.com> * style(linter): remove enabled by default linters Signed-off-by: Rodney Osodo <socials@rodneyosodo.com> --------- Signed-off-by: Rodney Osodo <socials@rodneyosodo.com> |
||
|
|
f4e3e8e09c |
COCOS-157 - Provide abstractions for VM management (#171)
* abstract vm creation and allow stopping computation Signed-off-by: SammyOina <sammyoina@gmail.com> * Refactor QEMU configuration loading and execution in main.go Signed-off-by: SammyOina <sammyoina@gmail.com> * * feat(agent-config): add support for sending agent configuration to manager Signed-off-by: SammyOina <sammyoina@gmail.com> * * chore(checkproto.yaml): update protoc-gen and protoc-grpc versions Signed-off-by: SammyOina <sammyoina@gmail.com> * * chore(auth): update mockery version to v2.43.2 * chore(main.go): update import path for vm package in agent * chore(main.go): update import path for vm package in manager * chore(go.mod): add github.com/google/logger v1.1.1 as a required dependency * chore(manager_test.go): update import path for vm package in manager * chore(logging.go): move logging.go to manager/qemu/vm package * chore(logging_test.go): move logging_test.go to manager/qemu/vm package * chore(vm_factory.go): rename vm_factory.go to provider.go in manager/qemu/vm/mocks package * chore(vm.go): move vm.go to manager/qemu/vm package * chore(vm.go): update import path for vm package in manager * chore(vm_test.go): move vm_test.go to manager/qemu/vm package * chore(vsock.go): move vsock.go to manager Signed-off-by: SammyOina <sammyoina@gmail.com> * * fix(main.go): change import path for 'github.com/ultravioletrs/cocos/manager/qemu/vm' to 'github.com/ultravioletrs/cocos/manager/vm' * fix(main.go): change vsock.Dial argument from 'vm.VsockConfigPort' to 'qemu.VsockConfigPort' * fix(main.go): change import path for 'github.com/ultravioletrs/cocos/manager/qemu' to 'github.com/ultravioletrs/cocos/manager/qemu' Signed-off-by: SammyOina <sammyoina@gmail.com> --------- Signed-off-by: SammyOina <sammyoina@gmail.com> |
||
|
|
4b5000d107 |
NOISSUE - Allow termination of manager (#95)
* Implement manager client heartbeat mechanism Introduced a heartbeat system in the manager service to maintain an active link with client agents. The updates involve sending periodic heartbeat signals controlled via the new environment variable `MANAGER_HEARTBEAT_INTERVAL`, defaulting to 1 second. The protobuf definitions, service interfaces, and server logic have been updated accordingly to handle the incoming signals. This change allows better tracking of active client connections and could be used for features like auto-reconnect or resource cleanup for lost connections in the future. Signed-off-by: SammyOina <sammyoina@gmail.com> * Enhance gRPC server to handle client termination Introduce the capability for the gRPC server in the manager module to process termination requests from clients. Server and client proto buffers have been updated to include a Termination message, allowing clients to signal the server to stop processing. Handling for communication and termination flows involving context cancellation have been added across server implementation, manager client, and main execution logic to properly respond to termination signals. This ensures clean shutdowns and resource deallocation when a client requests termination, improving the robustness of the system. Signed-off-by: SammyOina <sammyoina@gmail.com> * Refactor gRPC server context handling Removed the redundant storage of context in grpcServer struct and modified the NewServer constructor to exclude the context parameter. Updated the Process method to derive context from the stream directly, ensuring better context scoping and adherence to proper gRPC patterns. This change simplifies the server's context management and ensures contexts are more accurately associated with their corresponding streams, which can improve debugging and request cancellation behavior. Signed-off-by: SammyOina <sammyoina@gmail.com> * Update protoc to v4.25.3 and remove heartbeat system Upgraded protoc version to 4.25.3, ensuring compatibility with the latest protocol buffer features and potential performance improvements. Additionally, removed the entire heartbeat system, involving both its gRPC server implementation and associated client-side logic. This elimination suggests a shift in the service health-check strategy, potentially to a more modern or efficient model. Signed-off-by: SammyOina <sammyoina@gmail.com> * Removed heartbeat functionality for manager service The heartbeat feature and its related configuration have been removed across the application. This includes the removal of the Heartbeat method from the Service interface, the associated interval setting in the config structure, and related environment variable documentation. The change implies an architectural shift towards alternative methods of service availability checking or potentially a move to stateless service design. The cleanup removes unnecessary code, thus simplifying the overall service logic. Signed-off-by: SammyOina <sammyoina@gmail.com> * Introduce Terminate message and streamline gRPC server logic - Added a new `Terminate` protobuf message to handle termination requests more explicitly. - Removed redundant context cancels in gRPC server stream processing to rely on context propagation. - Updated channel usage to match the altered stream message types; this aligns message handling with the updated protobuf definitions. - Passed client authentication info to service run function to enhance the security during service execution. - Utilized stream's context directly in error groups for improved clarity and error handling scope. - Refactored message descriptors in protobuf to maintain correct ordering after introducing new message types. These changes enhance the codebase's maintainability, simplify the logic, and align with the updated messaging protocol. Signed-off-by: SammyOina <sammyoina@gmail.com> * Update agent configuration in main.go Signed-off-by: SammyOina <sammyoina@gmail.com> --------- Signed-off-by: SammyOina <sammyoina@gmail.com> |
||
|
|
2b760ec207 |
NOISSUE - Streamline client identification (#105)
* Update protoc to v4.25.3 and streamline client identification Protocol Buffer compiler `protoc` has been upgraded to version 4.25.3. This update unifies the version across multiple generated files to ensure compatibility and take advantage of any bug fixes and performance improvements in the new release. Additionally, the client identification process has been refined. The redundant `WhoAmI` message and the corresponding checks have been removed from the gRPC server implementation. Clients are now identified via their address from the `Process` stream's context as soon as a connection is established, simplifying the code and potentially reducing handshake time. This change sets the foundation for a leaner communication protocol between manager and agents, and could contribute to lower latencies in client-server interactions. Signed-off-by: SammyOina <sammyoina@gmail.com> * Fix goroutine bug in grpc server Signed-off-by: SammyOina <sammyoina@gmail.com> * Update PROTOC_VERSION to 25.3 Signed-off-by: SammyOina <sammyoina@gmail.com> * Refactor server reference in readme Updated the deployment instructions in the README to point to the correct server location after consolidating test server documentation. Removed obsolete `manager-server` module and associated main entry point to align with the new architecture and streamline workflows. Resolves issues with outdated links and cluttered repository structure. Signed-off-by: SammyOina <sammyoina@gmail.com> --------- Signed-off-by: SammyOina <sammyoina@gmail.com> |
||
|
|
722b463b6a |
NOISSUE - Use a single listener for logs and events (#82)
* add handler Signed-off-by: SammyOina <sammyoina@gmail.com> * Refactor gRPC and Protobuf integration for manager service - Shifted Protobuf message definitions to a separate package `pkg/manager`. - Updated references throughout the codebase to import and use the new package for gRPC service definitions. - Enhanced AgentLog message with additional fields `level` and `timestamp`. - Removed direct dependencies on old Protobuf-generated types in favor of the new package. - Deleted obsolete Protobuf-generated files as they are now superseded by the new `pkg/manager`. - Streamlined event publishing and gRPC handling in the manager service to use the updated Protobuf messages. This refactoring improves modularity by centralizing Protobuf message definitions and decouples internal representation from the gRPC interface, aligning with best practices for microservice architecture. Additionally, the enriched logging structure paves the way for more detailed and fine-grained log analysis. Signed-off-by: SammyOina <sammyoina@gmail.com> * Refactor vsock event/log handling and config Streamlined event and log services in the manager by moving vsock listening functions out of `managerService` initialization and into dedicated `RetrieveAgentEventsLogs` methods. This change decouples the manager service creation from the actual start of log listening, adding clarity and flexibility in service management. Also moved logging middleware invocation outside of network handling loops to avoid unnecessary overhead. Additionally, the agent's vsock port configuration is now dynamically passed to the `New` function in the `events` package instead of relying on a hardcoded constant, allowing for greater configurability and testability. Finally, updated message structures for event and log sending to conform with the `ClientStreamMessage` definitions. These modifications should improve parsing and handling consistency and prepare our system for future enhancements related to inter-process communication. Signed-off-by: SammyOina <sammyoina@gmail.com> * fix linting errors Signed-off-by: SammyOina <sammyoina@gmail.com> * correct path to generated files Signed-off-by: SammyOina <sammyoina@gmail.com> * fix comments Signed-off-by: SammyOina <sammyoina@gmail.com> * remove uneccessary comments Signed-off-by: SammyOina <sammyoina@gmail.com> --------- Signed-off-by: SammyOina <sammyoina@gmail.com> |
||
|
|
e2fb7ea88d |
NOISSUE - Add test server (#80)
* add test server Signed-off-by: SammyOina <sammyoina@gmail.com> * clean up and update docs Signed-off-by: SammyOina <sammyoina@gmail.com> * update docs Signed-off-by: SammyOina <sammyoina@gmail.com> * fix lint Signed-off-by: SammyOina <sammyoina@gmail.com> --------- Signed-off-by: SammyOina <sammyoina@gmail.com> |
||
|
|
6d0100c096 |
COCOS-73 - Remove server (#76)
* Refactor GRPC manager service and client The manager service and client have been restructured for stream communication, facilitating real-time agent events, logs, and run responses. The `Run` RPC is replaced by the `Process` stream RPC, enabling bidirectional streaming between clients and the manager service. This allows continuous interchange of different message types including `WhoAmIRequest`, `AgentLog`, `AgentEvent`, and `RunResponse`. Several message types have been adjusted and new fields introduced, like `AgentPort` in `RunResponse` and various agent-config attributes including CA files and instance IDs, to support TLS client authentication and distinguish between agent instances. We've also incorporated `google.protobuf.Timestamp` in `AgentEvent` for precise event logging. The client code reflects these modifications with updated method calls and stream handling logic for ongoing communication. Moreover, the updates necessitate corresponding changes throughout service, grpc, and sdk layers to interoperate with the new streaming approach. The transition to streaming paves the way for a more interactive, flexible communication system that can accommodate future expansion and real-time monitoring features. Signed-off-by: SammyOina <sammyoina@gmail.com> * add computation id to run response Signed-off-by: SammyOina <sammyoina@gmail.com> * rename request Signed-off-by: SammyOina <sammyoina@gmail.com> --------- Signed-off-by: SammyOina <sammyoina@gmail.com> |
||
|
|
8975e28437 |
COCOS-77 - Stream agent logs and events over gRPC (#78)
* Refactor GRPC manager service and client The manager service and client have been restructured for stream communication, facilitating real-time agent events, logs, and run responses. The `Run` RPC is replaced by the `Process` stream RPC, enabling bidirectional streaming between clients and the manager service. This allows continuous interchange of different message types including `WhoAmIRequest`, `AgentLog`, `AgentEvent`, and `RunResponse`. Several message types have been adjusted and new fields introduced, like `AgentPort` in `RunResponse` and various agent-config attributes including CA files and instance IDs, to support TLS client authentication and distinguish between agent instances. We've also incorporated `google.protobuf.Timestamp` in `AgentEvent` for precise event logging. The client code reflects these modifications with updated method calls and stream handling logic for ongoing communication. Moreover, the updates necessitate corresponding changes throughout service, grpc, and sdk layers to interoperate with the new streaming approach. The transition to streaming paves the way for a more interactive, flexible communication system that can accommodate future expansion and real-time monitoring features. Signed-off-by: SammyOina <sammyoina@gmail.com> * fix lint Signed-off-by: SammyOina <sammyoina@gmail.com> * Update GitHub Actions to Latest Versions Upgraded GitHub Actions 'checkout' to version 4 and 'setup-go' to version 5 across various workflow files to leverage the latest features and improvements for better performance and reliability. This also ensures compatibility with Go version 1.21.x which is specified in the workflows. Signed-off-by: SammyOina <sammyoina@gmail.com> * Refactor event handling and logging Reworked event and log processing to use channels instead of direct HTTP calls. Removed obsolete events package and consolidated event structures, leading to cleaner and more maintainable code. Updated agent events to use channels, enhanced error handling in log forwarding, and simplified manager `New` function signature to accept an event channel directly. - Removed `events` and `agentevents` packages to reduce complexity. - Replaced direct event server communication with internal channel usage. - Introduced `AgentEvent` struct in events.go for standardized event objects. - Adapted `managerService` to dispatch events and logs through channels. - Streamlined manager construction by removing the now-unnecessary event service and host IP parameters. This change results in a more robust and easier to extend event and log management system within the agent-manager interaction. Signed-off-by: SammyOina <sammyoina@gmail.com> * fix ci Signed-off-by: SammyOina <sammyoina@gmail.com> * remove unused code Signed-off-by: SammyOina <sammyoina@gmail.com> * add comments Signed-off-by: SammyOina <sammyoina@gmail.com> --------- Signed-off-by: SammyOina <sammyoina@gmail.com> |
||
|
|
046b549079 |
Remove HTTP server support and streamline config (#75)
The HTTP server-related code, documentation, and configurations have been removed as part of a shift towards prioritizing gRPC for service communication. This update includes deletions of HTTP host and port configs across various components, the manager HTTP API alongside its Swagger definition, and the removal of related scaffolding and utility code. This change simplifies the overall architecture and eliminates redundant HTTP support, focusing on optimizing gRPC performance and security features. Signed-off-by: SammyOina <sammyoina@gmail.com> |
||
|
|
34c3bbdbd8 |
COCOS-55 - Pass agent logs to manager via vsock (#64)
* Simplify event handling and config Streamlined event service interface by consolidating `SendEvent` and introducing `SendRaw`. Removed `notification_server_url` and `instance_id` parameters from several event publication calls to leverage centralized event construction. This change not only cleans up redundancy in event-related code but also simplifies the configuration data flow across the system, making it easier to manage and less error-prone. Uniform event generation now improves consistency and maintainability. Refactored configuration management in the agent and manager services. Removed notifications URL from the agent configuration, relying on a simplification that assumes a single source of events. Updated Manager Port to VsockConfigPort for clarity and consistency across vsock communication. These modifications should facilitate easier integration and extension of event and configuration systems in the future. Signed-off-by: SammyOina <sammyoina@gmail.com> * fix lint Signed-off-by: SammyOina <sammyoina@gmail.com> * Refactor error handling in agent event forwarding Introduced context and error channel handling to the agent event forwarding process. The logger now warns on errors during forward operations asynchronously, allowing for non-blocking error reporting. Additionally, reliance on the global logger was removed in favor of passing error information via channels, improving modularity and error flow control. Resolves issue with silent forwarding failures by providing a means to alert system operators without halting the service. This enhancement makes the error reporting more robust and reactive while maintaining service continuity. Signed-off-by: SammyOina <sammyoina@gmail.com> * remove unused field Signed-off-by: SammyOina <sammyoina@gmail.com> * Enhance agent logging via vsock connection Redirected agent logging to use a vsock connection instead of standard output, improving the process isolation and enabling centralized log management. The change involved dialing to the specified vsock log port and initializing the logger with the vsock connection rather than stdout. Additionally, the manager service now maintains a map of agent vsock cids to computation IDs, providing better tracking of computation resources. A routine to retrieve logs from agents was also initiated during the service setup to facilitate log collection. Consequential to these changes is the removal of a redundant os package import in the agent's main.go, further cleaning up the dependencies. Signed-off-by: SammyOina <sammyoina@gmail.com> * fail gracefully Signed-off-by: SammyOina <sammyoina@gmail.com> * Updated backoff strategy and VM configurations - Added `github.com/cenkalti/backoff` to direct dependencies for robust retry logic in agent configuration sending. - Modified the vsock logs port to align with the updated port range standards. - Enclosed kernel console arguments in quotes to ensure proper parsing in QEMU configurations. - Implemented exponential backoff when sending agent configurations to handle transient failures. Refactors: - Streamlined creation of `AgentConfig` within the computation setup to avoid unnecessary initializations when `c.AgentConfig` is not nil. Signed-off-by: SammyOina <sammyoina@gmail.com> * Refactor command execution and improve argument construction Consolidated the error handling in the command execution function for better readability. In the QEMU configuration, the argument assembly process is enhanced for clarity and correctness; the VNC parameter is now separate, and string quoting is handled properly for kernel parameters. These changes result in more maintainable code and prevent potential formatting issues during QEMU argument parsing. Resolves issues with argument construction in QEMU config module. Signed-off-by: SammyOina <sammyoina@gmail.com> * Refine default config handling and unpacking Improved the agent configuration by dynamically setting default values for the log level and port if they are not specified in the incoming configuration. Also streamlined configuration unpacking in the endpoint and service layers, reducing redundancy and ensuring all required fields are correctly copied over to the Manager's configuration structure. This change ensures better fault tolerance and more maintainable code by handling edge cases where configuration values might be missing. Signed-off-by: SammyOina <sammyoina@gmail.com> * rename dir Signed-off-by: SammyOina <sammyoina@gmail.com> * fix lint Signed-off-by: SammyOina <sammyoina@gmail.com> * Ensure runRes.Empty() reflects non-empty state Changed the always-true return value of the `runRes.Empty()` method to `false` to accurately indicate the presence of a response body. This adjustment ensures downstream handling of API responses aligns with actual content state. Signed-off-by: SammyOina <sammyoina@gmail.com> * Replace mglog with slog across codebase Updated various components to replace the `mglog` logger implementation with the `slog` logger. This change affects logging initialization and calls throughout the codebase including the agent, manager, and internal server components. Transitioning to `slog` is part of a broader shift to standardize the logging mechanism to improve maintainability and consistency. Signed-off-by: SammyOina <sammyoina@gmail.com> --------- Signed-off-by: SammyOina <sammyoina@gmail.com> |
||
|
|
55afe4c038 |
COCOS-49 - Pass agent configuration and computation via vsock (#57)
* Optimize QEMU launch and add V-sock support Refactored QEMU argument construction and launching logic by removing the dependency on 'agent.Computation'. This simplification makes the VM creation process more streamlined. Additionally, introduced V-sock capabilities in the QEMU configuration to facilitate improved guest-host communication. Updated the README to include kernel module setup instructions for the new V-sock feature. The V-sock implementation enables VMs to use a consistent communication channel that is not affected by network configuration changes, enhancing reliability and potential interoperability with host services. It's important to ensure that the necessary kernel modules are loaded as part of the setup process, as documented. Signed-off-by: SammyOina <sammyoina@gmail.com> * Add vsock-based communication to manager Introduced virtual socket (vsock) communication abilities in the manager package by implementing a new socket service. This includes establishing a vsock listener and stub methods for sending computation results and cleaning up resources. The addition provides the groundwork for interprocess communication between guest and host in virtualized environments. - Integrated the `mdlayher/vsock` library for handling virtual socket operations. - Created a new `sockService` struct to encapsulate vsock listener handling. - Implemented `NewVsock` constructor to initialize the listener with domain value `3`. - Added placeholder methods for future computation sending and service closing logic. This enhancement targets scenarios where efficient VM-to-host communication is required. Signed-off-by: SammyOina <sammyoina@gmail.com> * remove env Signed-off-by: SammyOina <sammyoina@gmail.com> * Refactor agent config and use vsock Introduce `AgentConfig` struct to group agent-related configurations, and update `Computation` struct to include the new `AgentConfig` field. Replace command-line computation extraction with vsock-based config retrieval for robustness and decoupling. The agent configuration is now read from a vsock connection during runtime, allowing for more dynamic and flexible deployments. Adjusted the main agent application logic to support these configuration changes, and corresponding changes have been made in the manager to facilitate vsock communication. This approach aligns with modern practices for microservices by streamlining configuration management and reducing reliance on static command-line parameters. Moreover, it enhances the scalability of the agent service by allowing configuration to be managed externally. Signed-off-by: SammyOina <sammyoina@gmail.com> * Refactor agent config and remove deprecated code Consolidated agent configuration management into a single `AgentConfig` message and pruned deprecated Protobuf `ComputationReq`, `DatasetReq`, and `AlgorithmReq` messages. Adapted corresponding manager service logic to the new configuration structure. These modifications align with updated manager API schema, facilitate clearer configuration handling, and improve maintainability. Signed-off-by: SammyOina <sammyoina@gmail.com> * send configuration Signed-off-by: SammyOina <sammyoina@gmail.com> * Switch agent to listen mode for manager connections Previously, the agent established a connection to the manager using a direct dial. This change shifts the setup to where the agent listens on a specified port and accepts incoming connections. It ensures that the agent properly handles incoming requests by initiating a listening socket and waiting for the manager to connect, enhancing the system's flexibility in connection management. This adjustment also includes graceful closure of the listening socket. Signed-off-by: SammyOina <sammyoina@gmail.com> --------- Signed-off-by: SammyOina <sammyoina@gmail.com> |
||
|
|
a5e6cae92c |
COCOS-44 - Add agent address to run responses (#45)
* Add agent address to run responses The manager service's Run method now retrieves the agent address upon successful computation execution, providing more informative responses across gRPC, HTTP, and logging endpoints. This change improves service transparency by returning the agent's address to be used by client services, making the manager service's external communication more comprehensive. Updated the `RunResponse` structure in the corresponding protocol buffers definition and response handling in gRPC and HTTP APIs, ensuring that agent address information is serialized appropriately. It also necessitates a slight adjustment in the QEMU configuration to manage port forwarding rules more dynamically, simplifying the process as only agent-relevant ports are incremented with each new computation. This extra detail in responses aids in debugging and offers better integration capabilities for clients. Signed-off-by: SammyOina <sammyoina@gmail.com> * Allocate dynamic ports for VM guests and expand error handling Refactored service initialization to accept host IP and incorporated dynamic port allocation for VM guests, replacing the prior static increment method. Introduced a new error type 'ErrFailedToAllocatePort' to capture instances where the system is unable to find a free port. Integrated a third-party error package for improved error wrapping and context. These changes prevent port conflicts between VM guests and enhance error diagnostics for service operations. Resolves issue with static port allocation leading to conflicts. Signed-off-by: SammyOina <sammyoina@gmail.com> * Add HOST_IP to service configuration Extend service configuration to include the host machine's IP address, allowing instances to be aware of their deployment environment. This update passes the new HostIP field to the service constructor, ensuring the service can now operate with host-specific logic. Signed-off-by: SammyOina <sammyoina@gmail.com> * Populate AgentAddress in gRPC Responses Enhanced the gRPC encode/decode functions to properly populate the 'AgentAddress' field in 'RunResponse' objects. This ensures that consumers of the gRPC interface receive complete response data, which previously omitted the important 'AgentAddress' information. The change impacts both server-side response encoding and client-side response decoding, aligning the implementation with the expected interface contract. Signed-off-by: SammyOina <sammyoina@gmail.com> * fix ci Signed-off-by: SammyOina <sammyoina@gmail.com> --------- Signed-off-by: SammyOina <sammyoina@gmail.com> |
||
|
|
c29ef354fa |
COCOS-24 - Add events service via http (#27)
* add status endpoint Signed-off-by: sammy <sammyoina@gmail.com> * feat: Update code generation tools to latest versions Update the code generation tools, including protoc-gen-go and protoc-gen-go-grpc, to their latest versions (v1.31.0 and v1.3.0, respectively). This ensures compatibility with the latest features and improvements. The updated tools also require gRPC-Go v1.32.0 or later. The new versions bring important updates and bug fixes, enhancing the performance and stability of the generated code. By staying up-to-date with the latest tooling, we can take advantage of the latest functionality and ensure a smooth development experience. No code changes are included in this commit. These updates only impact the code generation process. Signed-off-by: sammy <sammyoina@gmail.com> * Update Go version to 1.21.x Signed-off-by: sammy <sammyoina@gmail.com> * Refactor agent and manager services to publish event notifications The refactoring includes changes to the agent and manager services to incorporate event notifications. By publishing events, the services can inform subscribers about the current state of the computation or any updates. Specifically, the `agentService` now includes a `cmpHash` field to store the SHA-256 hash of the computation, which is subsequently used when publishing events. The `agentService` and `managerService` now use the `publisher` interface to publish events to the topic "manager". Notably, the removed `pubsub.go` file is no longer necessary. This commit improves the service architecture by allowing subscribers to receive relevant updates and monitor the progress of computations. It enhances the overall system by providing more transparency and enabling better coordination between the agent and manager services. Signed-off-by: sammy <sammyoina@gmail.com> * Improve generated Go file comparison in checkproto workflow Refactor the file comparison logic in the checkproto workflow to use the `-p` flag instead of `-s` for improved accuracy. This change ensures that the generated Go files are thoroughly compared with the original ones, detecting any discrepancies and preventing out-of-sync files from passing the validation. By using the `-p` flag, we now check both the contents and the metadata of the files, providing more robust synchronization checks. This update enhances the reliability of the checkproto workflow and helps maintain consistency between the proto files and their corresponding generated Go files. Signed-off-by: sammy <sammyoina@gmail.com> * Update file comparison command to detect differences line by line The code change updates the file comparison command used in the CI workflow to detect differences line by line instead of only reporting the first difference encountered. This change improves the accuracy of detecting inconsistencies between the original protobuf files and the generated Go files. Previously, only the first difference was reported, leading to potential missed issues. By comparing the files line by line, we can now detect and report all differences accurately. This change enhances the reliability of our CI pipeline and ensures that the generated Go files stay in sync with the protobuf files. Signed-off-by: sammy <sammyoina@gmail.com> * add event exporting to external server Signed-off-by: sammy <sammyoina@gmail.com> * feat: Add support for notification server URL The commit adds a new environment variable, `COCOS_NOTIFICATION_SERVER_URL`, which allows specifying the server to receive notification events from the agent. This addition provides flexibility to configure the notification server URL based on the deployment environment. This change enables seamless integration with different notification server instances and enhances the extensibility of the system. It resolves the need to modify the code directly when changing the server URL. Signed-off-by: sammy <sammyoina@gmail.com> * Refactor gRPC client and server, remove unused handlers The commit refactors the gRPC client and server code by removing the unused `nopDecoder` and `status` handlers from the client and server, respectively. This cleanup reduces code clutter and improves maintenance. No significant consequences are expected. Signed-off-by: sammy <sammyoina@gmail.com> * Ensure generated Go files stay in sync with proto files during the CI workflow Fixes an issue in the CI workflow where proto files and their corresponding generated Go files were not being properly compared for synchronization. Previously, the `cmp -l` command was used, which only printed differing byte positions, leading to false negatives. This has been corrected by using `cmp -s` instead, which outputs nothing if the files are identical. This change ensures that any differences between the proto files and their generated Go files will be detected, helping to maintain consistency and accuracy in the codebase. Signed-off-by: sammy <sammyoina@gmail.com> * Enhance notification payload and endpoint Extended the notification system to include 'status' and 'details' in the payload, improving traceability and debugging. Adapted the serialized JSON structure for clarity and added an 'originator' field to track the source service. Transitioned to a generalized event endpoint, facilitating a more streamlined event handling process. Refactors POST request to a more appropriate endpoint and updates the notification service interface to reflect new payload requirements. Signed-off-by: SammyOina <sammyoina@gmail.com> * Refactor event notification logic Removed the legacy notifications package and consolidated event notification functionality using the new internal events service. Modified agent, manager, and main application code to use this service for consistent event reporting and error handling workflows across services. This change simplifies event management, improves error visibility, and allows for more maintainable code by centralizing event-related logic. The substitution of verbose state-specific publishEvent calls with generic status reporting aligns with the new service's capabilities. Signed-off-by: SammyOina <sammyoina@gmail.com> * Expand agent protobuf message types and improve error handling The protobuf definition for agent messages has been updated to include an additional message type, facilitating future data structure expansions. Additionally, error handling for event sending in the main agent execution has been enhanced to log errors when sending 'init' events fail, ensuring issues are properly tracked. The unused `notificationTopic` constant in the manager service has been removed for cleaner code maintenance. Signed-off-by: SammyOina <sammyoina@gmail.com> * Optimize JSON parsing and fix header omission Removed unnecessary unquoting of a JSON string before unmarshaling, streamlining the computation value extraction process. Also corrected a missing Content-Type header in the event sending function, ensuring proper handling of JSON requests by recipients. These changes improve performance and communication reliability. Signed-off-by: SammyOina <sammyoina@gmail.com> * align vars Signed-off-by: SammyOina <sammyoina@gmail.com> * Remove computation request timeout feature The timeout feature for computation requests has been removed to simplify the computation execution flow. This involved changes across multiple files, including protobuf definitions, HTTP endpoint handling, and the internal computation logic. We eliminated the timeout field, associated logic, and error handling to ensure the system no longer supports timeouts for computations, mitigating any unintended timeout impacts on long-running processes. Signed-off-by: SammyOina <sammyoina@gmail.com> --------- Signed-off-by: sammy <sammyoina@gmail.com> Signed-off-by: SammyOina <sammyoina@gmail.com> |
||
|
|
4d9f7b1a45 |
COCOS-25 - Remove networking between manager and agent (#43)
* Enable passing computation details to QEMU args Refactored the QEMU argument construction to accept computation details as JSON. The changes include updating function signatures to pass an `agent.Computation` object and serialize it into the QEMU boot parameters. These modifications allow for custom computations to be specified per VM, facilitating more dynamic and configurable virtual machine creation. This update also integrates timing and data transformation necessary for the agent computation setup. Resolves computation configurability for virtual machines. Signed-off-by: SammyOina <sammyoina@gmail.com> * Enable extraction of computation value from cmdline Introduced a feature to parse the computation parameter directly from the machine's command line, thus enabling dynamic configuration of agent computations without the need for a preset environment. This change enhances the agent's ability to adapt to different computation requirements on startup. A fallback warning informs of cases where the computation parameter was not found, ensuring traceability of operational state during service initialization. Signed-off-by: SammyOina <sammyoina@gmail.com> * Remove RunRequest and RunResponse from Agent gRPC Refactor the Agent service by deleting the RunRequest and RunResponse messages along with the corresponding method implementations in the gRPC interface. This simplifies the gRPC protocol by removing an obsolete RPC method, streamlining communication between services. The deletion also impacts the manager service, which no longer uses the removed RPC method. Consequently, the service definitions across related files like endpoints, servers, and clients have been adjusted. This adjustment enforces a more direct approach for starting computations, reflecting a shift toward manager-initiated computation execution. Signed-off-by: SammyOina <sammyoina@gmail.com> --------- Signed-off-by: SammyOina <sammyoina@gmail.com> |
||
|
|
46a91ccbc3 |
remove nats (#41)
Signed-off-by: SammyOina <sammyoina@gmail.com> |
||
|
|
a3c5c765b8 |
NOISSUE - Connect to agent when vm is created (#33)
* Add gRPC TLS config and update protoc versions Enhanced manager service with TLS configuration options and timeout settings for gRPC communication with agent services. Updated corresponding protobuf definitions to include new fields for CA certificates, TLS status, and timeouts. The added TLS support ensures secure inter-service communication while flexible timeout configurations improve the robustness of network interactions. The following adjustments and additions were made: - Included new gRPC configuration fields (ca_certs, client_tls, timeout) for agent communication in protobuf definitions. - Injected agent gRPC configuration into service endpoints, ensuring secure TLS setup and compliance with provided settings. - Revised main service function signatures to accept the gRPC configuration object. - Incremented port forwarding counters post-computation to avoid port conflicts. - Conducted compatibility update of protobuf version comments to reflect minor version bump (v4.25.0 to v4.25.1). This change impacts service deployment that requires proper configuration of TLS credentials and mindful determination of timeouts for efficient network use. Signed-off-by: SammyOina <sammyoina@gmail.com> * update proto Signed-off-by: SammyOina <sammyoina@gmail.com> * Set default timeout for agent configuration Introduced a default timeout of 60 seconds in both gRPC and HTTP endpoints for agent configuration when none is specified. This change ensures that operations do not hang indefinitely and provides a reasonable default for client interactions. Additionally, the instantiation of a new agent client is now outside the retry loop to avoid repeated setup on transient failures. Refactors service logic to optimize client connection handling by moving the agent client setup to occur before attempting retries, which should reduce overhead and improve clarity in error situations. Signed-off-by: SammyOina <sammyoina@gmail.com> * Subject: Move agent client creation into retry loop Body: Refactored the service manager's Run method to initialize the agent gRPC client inside the exponential backoff retry loop. This change addresses intermittent connection issues by reattempting client creation on temporary network failures, ensuring a robust setup before calling the Run method on the client. Signed-off-by: SammyOina <sammyoina@gmail.com> --------- Signed-off-by: SammyOina <sammyoina@gmail.com> |
||
|
|
18508796c1 |
add linters (#23)
- switchTrue
- importShadow
- httpNoBody
- paramTypeCombine
- emptyStringTest
- builtinShadow
- exposedSyncMutex
- importas
Signed-off-by: SammyOina <sammyoina@gmail.com>
|
||
|
|
0f831706f9 |
NOISSUE - Add CoCos-AI Manager API specification (#19)
* Add CoCos-AI Manager API specification This commit adds the CoCos-AI Manager API specification in the form of a YAML file. The specification includes information about the title, description, contact, license, and version of the API. It also defines the servers where the API is hosted and the paths and operations available, such as running computation on a virtual machine. The API specification is based on OpenAPI 3.0.1 and provides a clear and concise overview of the CoCos-AI Manager service. The commit also includes a link to the CoCos-AI repository and the license information. This commit is necessary to provide a clear and documented API specification for the CoCos-AI Manager service. Signed-off-by: SammyOina <sammyoina@gmail.com> * Refactor go.mod and go.sum files The go.mod and go.sum files have been refactored to remove the go.opentelemetry.io/contrib/propagators/jaeger package, which is no longer needed. This package was causing compatibility issues with the current version of the project. The refactoring ensures that the project is using the latest compatible versions of the required dependencies. This commit removes the go.opentelemetry.io/contrib/propagators/jaeger package from the go.mod file and updates the go.sum file accordingly. Note: The go.mod file now uses go.opentelemetry.io/otel v1.19.0 and go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.19.0. Please review the changes to ensure compatibility and functionality. Signed-off-by: SammyOina <sammyoina@gmail.com> * fix typo Signed-off-by: SammyOina <sammyoina@gmail.com> --------- Signed-off-by: SammyOina <sammyoina@gmail.com> |
||
|
|
298878c96d |
NOISSUE - Rename module to cocos (#22)
* add stringer Signed-off-by: SammyOina <sammyoina@gmail.com> * rename module to cocos Signed-off-by: SammyOina <sammyoina@gmail.com> * Fix fmt.Stringer implementation in computations.go The fmt.Stringer implementation for Datasets and Algorithms in computations.go was fixed to correctly use pointers. This commit addresses the issue where the String() method for Datasets and Algorithms in computations.go was not correctly implemented. The fix ensures that the String() method now correctly marshals the data to JSON and returns the string representation. The changes made in this commit will improve the functionality and accuracy of the String() method for Datasets and Algorithms. Signed-off-by: SammyOina <sammyoina@gmail.com> --------- Signed-off-by: SammyOina <sammyoina@gmail.com> |