28 Commits

Author SHA1 Message Date
Sammy Kerata Oina d5badba547 COCOS-584 - Support multiple kbs (#587)
CI / lint (push) Has been cancelled
CI / test (agent) (push) Has been cancelled
CI / test (cli) (push) Has been cancelled
CI / test (cmd) (push) Has been cancelled
CI / test (internal) (push) Has been cancelled
CI / test (manager, true) (push) Has been cancelled
CI / test (pkg) (push) Has been cancelled
CI / upload-coverage (push) Has been cancelled
* feat: Implement per-resource KBS configuration, allowing algorithms and datasets to specify individual KBS URLs.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* refactor: Encapsulate CLI error handling and CVM certificate paths within the CLI struct, and add algorithm type to agent's algorithm structure.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* style: Remove blank lines and fix indentation in CLI commands.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* refactor: Update downloadAndDecryptGenericResource to accept KBS URL as a parameter and adjust related tests

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* refactor: group CLI configuration into structured types and simplify skopeo decryption key handling

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

---------

Signed-off-by: Sammy Oina <sammyoina@gmail.com>
2026-05-05 11:01:56 +02:00
Sammy Kerata Oina c59a413765 NOISSUE - Implement extensible resource downloader framework with support for S3, GCS, and OCI sources (#590)
CI / lint (push) Has been cancelled
CI / test (agent) (push) Has been cancelled
CI / test (cli) (push) Has been cancelled
CI / test (cmd) (push) Has been cancelled
CI / test (internal) (push) Has been cancelled
CI / test (manager, true) (push) Has been cancelled
CI / test (pkg) (push) Has been cancelled
CI / upload-coverage (push) Has been cancelled
* feat: implement extensible resource downloader framework with support for S3, GCS, and OCI sources

Signed-off-by: SammyOina <sammyoina@gmail.com>

* refactor: improve resource URL parsing and add support for bare OCI image references

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix: add empty string check and slash requirement for OCI image inference, and update python unit tests with event mock expectations

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* refactor: introduce OCIClient interface, add test coverage for decryption, and improve resource download error handling

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* chore: remove trailing whitespace in OCI downloader and HTTP tests

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

---------

Signed-off-by: SammyOina <sammyoina@gmail.com>
Signed-off-by: Sammy Oina <sammyoina@gmail.com>
2026-04-28 11:21:03 +02:00
Sammy Kerata Oina da31d76c94 NOISSUE - Agent Pull mode for remote resources (#575)
CI / checkproto (push) Has been cancelled
CI / lint (push) Has been cancelled
Rust CI Pipeline / rust-check (push) Has been cancelled
CI / test (agent) (push) Has been cancelled
CI / test (cli) (push) Has been cancelled
CI / test (cmd) (push) Has been cancelled
CI / test (internal) (push) Has been cancelled
CI / test (manager, true) (push) Has been cancelled
CI / test (pkg) (push) Has been cancelled
CI / upload-coverage (push) Has been cancelled
* feat(kbs): implement KBS client for attestation and resource retrieval

- Added KBS client implementation in pkg/kbs/client.go with methods for attestation and resource retrieval.
- Introduced necessary data structures for requests and responses.
- Implemented error handling for various scenarios.

test(kbs): add unit tests for KBS client

- Created comprehensive tests for the KBS client in pkg/kbs/client_test.go.
- Included tests for attestation success and failure cases, as well as resource retrieval.

feat(registry): introduce HTTP and S3 registry implementations

- Added HTTPRegistry for downloading resources over HTTP/HTTPS with retry logic in pkg/registry/http.go.
- Implemented S3Registry for downloading resources from AWS S3 and S3-compatible services in pkg/registry/s3.go.
- Included error handling and configuration options for both registries.

chore(registry): define registry interface and configuration

- Created registry interface and configuration struct in pkg/registry/registry.go.
- Added default configuration settings for registry clients.

docs(cvms): update README for CVMS server configuration and usage

- Enhanced documentation for CVMS server with detailed command-line flags and usage examples.
- Clarified direct upload and remote resource modes, including KBS integration.

fix(cvms): integrate KBS for remote resource handling in main.go

- Updated main.go to support remote datasets and algorithms using KBS.
- Added validation for command-line flags to ensure proper configuration.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix: Move ifeq conditional outside define block in attestation-service.mk

Make conditionals cannot be evaluated inside define...endef blocks
when used as recipe bodies. Restructured to define the
ATTESTATION_SERVICE_INSTALL_INIT_SYSTEMD block conditionally based
on BR2_PACKAGE_CC_ATTESTATION_AGENT configuration.

* feat: Implement remote resource downloading for algorithms and datasets using AWS S3/MinIO credentials.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: Add comprehensive documentation and agent support for testing remote resource download with KBS attestation.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: Improve agent logging for remote resource configuration and KBS status, and add a testing guide for remote resource downloads with KBS attestation.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: Add a comprehensive guide for testing remote resource download with KBS attestation and update multiple package versions to a specific commit.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: Add failure transitions for resource reception states and a comprehensive guide for testing remote resource downloads with KBS attestation.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: Implement remote resource download with KBS attestation in the agent and add a comprehensive testing guide.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* test: Add comprehensive guide for testing remote resource download with KBS attestation and include a debug log in the attestation client.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: Delegate KBS attestation and token retrieval to a new attestation-agent service and document remote resource testing.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* client fixes

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* raw evidence

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix: Build all Go files in cmd directories, not just main.go

This fixes the issue where fetch_raw_evidence.go wasn't being included
in the attestation-service build.

* fix: Wrap binary evidence in JSON for KBS compatibility

Fixes 'invalid character' error by wrapping raw binary evidence
in a JSON structure with base64 encoding, as expected by KBS.

* chore: Update buildroot packages to c28cefae

Includes fixes for:
1. attestation-service build (including fetch_raw_evidence.go)
2. Agent KBS evidence format (wrapping binary in JSON)

* fix: Implement KBS RCAR handshake with cookies

Fixes 'cookie not found' error (401) from KBS by:
1. Adding CookieJar support to KBS client
2. Implementing GetChallenge() to perform /auth handshake and capture session cookie
3. Updating Agent to get challenge, decode nonce, and use it for evidence generation
4. Regenerating mocks

* chore: Update buildroot packages to f6981ac5

Includes KBS RCAR handshake fix (cookie support + GetChallenge loop)

* fix: Update KBS client JSON tags to kebab-case

Fixes deserialization error (401) from KBS by:
1. Using kebab-case (e.g. extra-params) for JSON tags as per protocol.
2. Initializing ExtraParams as empty object {} instead of null/omitted.

* fix: Wrap attestation evidence in primary_evidence format

Updates Agent to construct 'tee-evidence' payload with:
- primary_evidence: containing the actual quote/data
- additional_evidence: empty JSON object

This matches the Confidential Containers KBS Attestation Protocol requirements.

* fix: Update KBS protocol version to 0.4.0

KBS rejected 0.1.0 with a version mismatch error. Bumping to 0.4.0 to match server expectation.

* fix: Generate ephemeral key for KBS RuntimeData

Updates RuntimeData to include a valid ephemeral EC P-256 public key in JWK format, as required by the KBS RCAR protocol.
Also fixes the KBS client struct to support TEEPubKey as an object.

* fix: Update sample attestation quote to valid JSON

The default attestation.bin was binary, but the KBS Sample Verifier expects a valid JSON quote containing 'svn' and 'report_data'.
Updated the embedded bin file to contain this JSON structure.

* fix: Generate dynamic JSON quote for Sample TEE in FetchRawEvidence

The KBS Sample Verifier expects a JSON object with 'svn' and 'report_data'.
Previously, we were returning raw binary data (reportData+nonce).
This commit updates FetchRawEvidence to return a marshaled JSON structure with:
- svn: "1"
- report_data: base64(req.ReportData)

* refactor: Delegate Sample Attestation to Provider

Refactored sample attestation logic:
- Moved JSON Quote generation into EmptyProvider (standalone mode).
- Updated FetchRawEvidence to call provider.TeeAttestation instead of manual generation.
This enables using the real CC Attestation Agent for UNSPECIFIED platform if configured.

* feat: Add comprehensive debug logging and enforce CC AA usage

Changes:
- Updated EmptyProvider to return error instead of generating mock data
  This forces proper use of CC Attestation Agent's sample attester
- Added detailed logging to attestation-service FetchRawEvidence:
  * Hex dump of evidence (first 200 bytes)
  * String preview of evidence
  * Total evidence length
- Added detailed logging to agent service:
  * Raw evidence hex and string previews
  * KBS evidence JSON preview (first 500 bytes)
  * Evidence lengths at each transformation step

This logging will help diagnose why KBS Sample Verifier is rejecting evidence.

* fix: Enable CC AA by default and add attestation-service log forwarding

Changes:
- Set USE_CC_ATTESTATION_AGENT=true by default in systemd service
- Added StandardOutput/StandardError to forward logs to /var/log/cocos/
- Updated HAL makefile to handle new default value
- This ensures attestation-service uses CC AA's sample attester
- Logs will now be visible in CVMS output for debugging

* feat: Add gRPC log forwarding to attestation-service

Implemented the same log forwarding mechanism used by the agent:
- Added ProtoHandler to write logs to both stdout and logQueue
- Connected to log client (/run/cocos/log.sock) for gRPC forwarding
- Added goroutine to forward logs to CVMS via log client
- Logs will now appear in CVMS output during computation runs

This enables visibility into attestation-service debug output including:
- CC AA connection status
- Evidence generation details (hex dumps, string previews)
- Any errors from providers

* fix: Parse sample evidence JSON instead of base64-encoding it

The attestation-service returns sample evidence as JSON:
{"svn":"1","report_data":"base64..."}

The agent was incorrectly base64-encoding this JSON string again.
KBS Sample Verifier expects the parsed JSON object directly.

Fixed by:
- Parsing the JSON evidence from attestation-service
- Passing the parsed object directly in primary_evidence.evidence
- This matches what KBS Sample Verifier expects

* debug: Increase KBS evidence logging preview to 1000 bytes

Show the complete JSON structure being sent to KBS to debug
the attestation failure.

* debug: Add comprehensive CC AA configuration logging

Added debug logs to show:
- Whether CC AA is enabled in config
- CC AA address being used
- Connection success/failure
- Which provider is ultimately selected
- Warning when falling back to EmptyProvider

This will help diagnose why EmptyProvider is being used
instead of CC Attestation Agent.

* debug: Add startup logging for log client connection

Added log message to show if log client connection succeeds
at attestation-service startup. This will help diagnose why
logs aren't appearing in CVMS output.

* feat: Add retry logic with exponential backoff to log client

Added simple retry mechanism to handle concurrent log requests:
- 3 retry attempts with exponential backoff (10ms, 20ms, 40ms)
- Applies to both SendLog and SendEvent methods
- Centralized in log client so all services benefit
- Should eliminate 'failed to send log' errors from concurrent requests

This fixes the issue where attestation-service logs weren't
appearing in CVMS output due to dropped messages.

* fix: Flatten sample evidence fields in primary_evidence for KBS

KBS Sample Verifier expects svn and report_data at the top level
of primary_evidence, not nested under an 'evidence' key.

Changed structure from:
{"primary_evidence": {"tee": "sample", "evidence": {"svn": "1", ...}}}

To:
{"primary_evidence": {"tee": "sample", "svn": "1", "report_data": "...", ...}}

This matches what KBS expects when deserializing the Quote structure.

* fix: Use sample quote directly as primary_evidence per KBS protocol

According to KBS attestation protocol spec, for sample TEE type,
primary_evidence should be the sample quote JSON directly:
{"svn": "1", "report_data": "..."}

Removed extra 'tee' and 'platform' fields that were causing KBS
to fail deserializing the Quote structure. The 'tee' field is
already sent in the Request payload during RCAR handshake.

Refs:
- https://github.com/confidential-containers/trustee/blob/main/kbs/docs/kbs_attestation_protocol.md
- https://github.com/confidential-containers/guest-components/blob/main/attestation-agent/attester/src/sample/mod.rs

* fix: Make CC AA required for sample attestation when configured

When USE_CC_ATTESTATION_AGENT=true, attestation-service now
requires AA to be available for NoCC/sample platform. This ensures
sample evidence always comes from AA with the correct KBS format.

Changes:
- Error out if AA connection fails for NoCC platform when AA is configured
- Only use EmptyProvider if AA is explicitly NOT configured
- Prevents incorrect sample evidence format from EmptyProvider

This ensures attestation-service delegates to AA for sample evidence
generation instead of creating it itself.

* fix: Implement proper RCAR protocol with tee-pubkey and runtime-data hash

Fixed KBS attestation error 'REPORT_DATA is different from that in Sample Quote'

Changes:
1. Generate ephemeral EC key pair BEFORE getting evidence from AA
2. Create runtime-data with nonce + tee-pubkey (JWK format)
3. Hash runtime-data (SHA-256) and use as report_data for AA
4. This binds the tee-pubkey to the TEE evidence per RCAR protocol

The report_data in the evidence now matches what KBS expects:
hash(runtime-data) instead of computation ID.

This completes the full RCAR protocol implementation:
- Request → Challenge → Attestation (with bound tee-pubkey) → Response

* fix(agent): use simple nonce for Sample attestation report_data

For Sample/NoCC attestation, use the raw nonce bytes directly as
report_data instead of hashing runtime-data. This avoids JSON
serialization mismatches with the KBS Sample verifier.

Real TEEs (TDX/SNP) still use runtime-data hash binding to
cryptographically bind the ephemeral tee-pubkey to the evidence.

* fix(agent): use RFC 8785 canonical JSON for runtime-data hashing

The KBS Sample attestation verifier (and likely others) expects the
report_data to be the SHA-256 hash of the *canonical* JSON serialization
(RFC 8785) of the runtime-data. Standard Go JSON marshaling does not
guarantee key ordering, leading to hash mismatches.

This change uses github.com/gowebpki/jcs to canonicalize the runtime-data
before hashing, ensuring compatibility with the KBS RCAR implementation.
Also reverted the temporary 'simple nonce' workaround.

* feat(hal): add CoCo Keyprovider and Skopeo packages

- Add coco-keyprovider buildroot package with systemd service
- Add skopeo buildroot package for OCI image handling
- Add ocicrypt_keyprovider.conf for encrypted image decryption
- Update Config.in to include new packages

This enables standard CoCo ecosystem integration for encrypted
OCI images instead of custom S3/HTTP registry clients.

* feat(oci): add OCI image handling package with Skopeo integration

- Add pkg/oci/types.go with ResourceSource and ImageManifest types
- Add pkg/oci/skopeo.go with Skopeo wrapper for pull/decrypt
- Add pkg/oci/extract.go for extracting algorithms and datasets from layers

This package provides OCI image handling using Skopeo and CoCo
Keyprovider for encrypted image decryption, replacing custom
S3/HTTP registry clients.

* chore: regenerate protobuf files for updated cvms.proto

* refactor(agent): replace S3/HTTP/KBS with OCI package

- Remove pkg/kbs and pkg/registry imports
- Add pkg/oci import for OCI image handling
- Replace downloadAndDecryptResource with OCI-based implementation
- Use Skopeo + CoCo Keyprovider for automatic decryption
- Reduce code from ~240 lines to ~70 lines

This eliminates custom KBS RCAR handshake, S3/HTTP registry clients,
and manual decryption logic. CoCo Keyprovider handles all decryption
automatically via ocicrypt protocol.

* chore: remove obsolete pkg/kbs and pkg/registry packages

- Delete pkg/kbs/ (custom KBS client, ~300 lines)
- Delete pkg/registry/ (S3/HTTP registry clients, ~400 lines)
- Remove unused imports from agent/service.go
- Run go mod tidy to clean up dependencies

These packages have been replaced by pkg/oci with Skopeo and
CoCo Keyprovider for standard CoCo ecosystem integration.

* fix(agent): update ResourceSource struct to include type and encryption fields

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix(hal): update CoCo Keyprovider to v0.16.0 and fix build path

- Update version from v0.11.0 to v0.16.0 (matches attestation agent)
- Fix install path: target is at repo root, not in coco_keyprovider subdir
- This fixes the build error where coco_keyprovider binary wasn't found

The cargo workspace in guest-components builds to a shared target/
directory at the repository root, not within each crate's subdirectory.

* feat: Update remote resources testing guide to use kbs-client and coco-keyprovider for key management and encryption, enable insecure TLS for Skopeo, and enhance CVMS with

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: Update component versions, revise image encryption documentation, and sanitize OCI image paths for Skopeo compatibility.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: Add `decompress` option to Dataset and `algo_type`/`algo_args` to Algorithm protobuf messages, updating client, test, and build configurations.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* Update multiple package versions and enhance OCI image extraction error reporting for missing algorithm files.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* chore: Bump package versions, improve OCI image extraction debugging by returning seen files, and remove unused dataset type parsing from test code.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* refactor: Migrate OCI extraction to use structured logging with `slog` and `context`, and update package versions.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: Bump multiple component versions, add encrypted status for computation inputs and algorithms, and refine OCI layer extraction warnings.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* logging

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: Add `Encrypted` field to algorithm and dataset resource sources and update all component versions.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: update component versions, integrate coco-keyprovider service, and configure ocicrypt key provider.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: add support for KBS parameters and dataset/algorithm hash calculations in CVMS

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: update resource download and extraction logic to support requirements.txt and improve hash verification

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* chore: Update dependencies, improve code style, and add GetRawEvidence to attestation client mocks.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* Refactor code structure for improved readability and maintainability

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix: update golangci configuration to include errcheck for build path and remove unnecessary exclusions

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix: streamline kernel command line handling in QEMU args construction

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: add attestation binary and update checksum tests and policy structure

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* Add unit tests for attestation agent, attestation, log, crypto, OCI, and Skopeo clients

- Implement tests for the attestation agent client including Unix socket and TCP address handling, token retrieval, and error scenarios.
- Enhance attestation client tests to cover fetching raw evidence for various platforms (SNP, TDX, VTPM, SNPvTPM) and validate error handling.
- Introduce log client tests to verify retry behavior for sending logs and events.
- Create comprehensive tests for crypto package focusing on AES-GCM decryption, encrypted resource parsing, and key unwrapping.
- Add tests for OCI package to validate algorithm and dataset extraction, including JSON serialization of OCILayout.
- Implement Skopeo client tests to ensure proper functionality for image pulling, inspecting, and resource source handling.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix: handle JSON marshal errors in test cases for decrypt and extract functions

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* test: add comprehensive tests for algorithm and dataset extraction with various scenarios

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* refactor: replace hardcoded Python script content with constant variable

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix: remove redundant mock expectation for SendAgentConfig in TestCreateVMWithAaKbsParams

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* test: add tests for event sending failure, dataset extraction with path traversal, and Skopeo client behavior

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* test: add tests for download and decryption of resources with various URL formats

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* refactor: Introduce OCIClient interface for agent service to improve testability of OCI image operations and enhance related tests.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* refactor: Change `get_uint64_from_tcb` to accept `TcbVersion` by value and use `u64::from` for type conversions.

---------

Signed-off-by: Sammy Oina <sammyoina@gmail.com>
2026-03-16 14:48:55 +01:00
Sammy Kerata Oina a3265bc346 NOISSUE - Introduce computation runner, log forwarder, ingress, and egress proxy services. (#559)
* feat: Introduce computation runner, log forwarder, ingress, and egress proxy services.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: Update Go environment variable parsing and build system to use new architecture and repository.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: Update package sources to `sammyoina/cocos-ai` at a specific commit, add log-forwarder pre-start hook, and rename proxy binaries.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* chore: Update build system references to a specific commit and enhance logging for service connections and message processing.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* build: Update package source repositories and versions, migrate client logging to slog, and adjust ingress/egress proxy build and install steps.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* debug stuck

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* debug

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* debug

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: add HTTP/2 support to egress proxy and update build system to use specific commit hashes

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: enhance egress proxy CONNECT handling, update package sources, and add gRPC test utility

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: Update build system for various services to a specific commit from a new repository, change agent gRPC port to 7001, and add a gRPC test client.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: Migrate agent-internal gRPC communication to Unix sockets, set ingress proxy to port 7002, and update build hashes.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* refactor: Remove standalone ingress-proxy systemd service and update component versions.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix: Prevent computation re-initialization in agent and update component versions across several packages.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: update package versions and enable h2c support in ingress proxy.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: refactor ingress proxy to support HTTP/2 over Unix sockets and update component versions.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: Update build system package sources to `ultravioletrs/cocos` and reduce agent logging verbosity.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* refactor: improve error handling in proxy commands and remove unused gRPC test

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* test: add mock service state return value in handleRunReqChunks test

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* feat: add comprehensive tests for service and proxy components

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix linter

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* improve coverage

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* test: add gRPC client and ingress adapter tests, and update egress proxy tests.

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* improve coverage

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

---------

Signed-off-by: Sammy Oina <sammyoina@gmail.com>
2026-02-09 10:38:21 +01:00
Sammy Kerata Oina c14f1d7b6c COCOS-391- GCP Attestation policy (#405)
* Add AgentGrpcHost configuration to agent server

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* Add SHA1 support to PcrValues and implement GCP attestation functions

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* Add GCP attestation policy and OVMF download commands

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* Add vTPM attestation support and update protobuf versions

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* Remove Host field from AgentConfig and update related references

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* Update GCP attestation policy to accept vCPU count as an argument

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* Add SHA512 digest verification for OVMF file in GCP download command

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* Update OVMF object name format in GCP attestation package

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* Refactor attestation policy structure to use nested Config field

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

---------

Signed-off-by: Sammy Oina <sammyoina@gmail.com>
2025-03-19 09:39:46 +01:00
Sammy Kerata Oina ecad6514f3 COCOS-344 - New agent structure (#350)
CI / checkproto (push) Has been cancelled
CI / ci (push) Has been cancelled
* new agent structure

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* minor fixes and testing

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix lint

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix tests

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* cvm tests fix

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix test

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix cli test

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* rename

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* rename cvm to cvms plural

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* rename service

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix tests

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* remove context

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* refactor: reorder parameters in NewAlgorithm functions and update CVMClient to CVMSClient

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix(tests): update SendEvent mock to include an additional parameter

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* move expectations

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix(tests): move event initialization to the correct scope in service tests

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix(tests): update SendEvent mock to use EXPECT instead of On in service tests

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

---------

Signed-off-by: Sammy Oina <sammyoina@gmail.com>
2025-01-17 12:50:53 +01:00
Sammy Kerata Oina bceb1727d1 NOISSUE - Use Mockery config (#323)
Signed-off-by: Sammy Oina <sammyoina@gmail.com>
2024-11-28 15:11:13 +01:00
Sammy Kerata Oina 20e7ea76e0 NOISSUE - Improve pkg tests (#286)
* add pkg tests

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix lint

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* expect errors

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix assertions

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix test cases

Signed-off-by: SammyOina <sammyoina@gmail.com>

* coverage files

Signed-off-by: SammyOina <sammyoina@gmail.com>

* add more test cases

Signed-off-by: SammyOina <sammyoina@gmail.com>

* improve tests

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* update test descriptions

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

---------

Signed-off-by: Sammy Oina <sammyoina@gmail.com>
Signed-off-by: SammyOina <sammyoina@gmail.com>
2024-10-30 16:04:30 +01:00
Sammy Kerata Oina 184617da9e NOISSUE - Host data verification (#275)
* host data verification

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* update mocks

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix test

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* debug host data

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* debug

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* check device

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* imorove test

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* missing header

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* update embed option

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* minor fixes

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* add tests

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix lint

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* update deps

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* each case is unique

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* all files

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix coverage

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* resolve comments

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* improve coverage

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* add test case

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* add test cases

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* use consts

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* add coverage

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* make sure pid is exited

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

---------

Signed-off-by: Sammy Oina <sammyoina@gmail.com>
2024-10-14 11:12:49 +02:00
Sammy Kerata Oina 899bfb0ec5 COCOS-151 - Add compression/decompression option for CLI/Agent (#200)
* on the fly compression

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* rename file-hash to checksum

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* check error properly

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix lint

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix connection handling

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

---------

Signed-off-by: Sammy Oina <sammyoina@gmail.com>
2024-08-21 11:54:52 +02:00
Sammy Kerata Oina 31391a36a1 NOISSUE - Add optional Filename (#195)
* add optional filename

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* cover all bases

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* maintain filename

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

---------

Signed-off-by: Sammy Oina <sammyoina@gmail.com>
2024-08-09 13:09:11 +02:00
Sammy Kerata Oina 67d01e39be COCOS-155 - Add python algo support (#178)
* * feat(algorithm.go): add support for algorithm type context
* feat(python.go): implement Python algorithm runtime
* fix(cocos_defconfig): add IPTABLES package

Signed-off-by: SammyOina <sammyoina@gmail.com>

* update proto

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* small fixes

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* add metadata

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* debug

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* debug

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* chunk logger

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* debug logger

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* test lock

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* add req file

Signed-off-by: SammyOina <sammyoina@gmail.com>

* stream result

Signed-off-by: SammyOina <sammyoina@gmail.com>

* test with venv

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* fix missing requirements file

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* result stream

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* modify test server

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* remove debugging and cleaning up

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* original repo

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* add missing header

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

* downgrade protoc

Signed-off-by: Sammy Oina <sammyoina@gmail.com>

---------

Signed-off-by: SammyOina <sammyoina@gmail.com>
Signed-off-by: Sammy Oina <sammyoina@gmail.com>
2024-07-23 16:38:03 +02:00
Sammy Kerata Oina 2ce112cc1b COCOS-103 - User authN and AuthZ using digital signatures (#128)
* Update Go to 1.22 and enhance security features

- Upgraded the Go version in GitHub Actions workflows to 1.22.x for latest features and security patches.
- Added RSA public key field `UserKey` in `Dataset` and `Algorithm` to reinforce data integrity and encryption.
- Refactored `Result` method in `agentService` to use `containsID` for improved readability and potential performance benefits.
- Updated `grpcserver.New` and `internal/server/grpc` invocations to pass `agent.Service` by value in line with recommended Go practices.
- Introduced `grpc.StreamInterceptor` with no args in `Server.Start` which seems to be an initial step for future stream interceptor configuration.

These changes prepare for stronger data security measures, maintain compatibility with the latest Go features, and improve code quality regarding service struct usage. Potential follow-up is needed to configure the stream interceptor and to ensure the new RSA key field is appropriately utilized in data handling.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refactor auth system and protocol buffers

Enhanced the authentication system by adding context support and an improved user-role model. Implemented robust RSA public key verification for users and a restructured interceptor logic specific to stream types, streamlining the auth process. Updated protocol buffers and associated structures to accommodate user keys as byte slices, aligning with standard cryptographic practice. CLI commands for algorithms and datasets now require a private key file path argument for signing, strengthening security during interactions.

This comprehensive overhaul addresses security and efficiency considerations in the RPC framework and aligns with best practices for key handling. By streamlining and securing the user authentication process, the agent service's reliability is greatly improved, directly impacting the robustness of the entire computation pipeline.

- Refactored auth: added role-based user validation, context handling
- Reworked interceptors: separated stream types, fortified signature checks
- Updated protocol buffers: user public keys as byte slices for standard compatibility
- Enhanced CLI: introduced private key argument, ensuring secure algorithm and dataset submission
- Improved server and SDK contracts to align with auth changes

Related issues:
- Implements user roles and auth context [#103]
- CLI security enhancement for private key management

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Updated PEM decoding for key parsing in CLI and tests

Added `encoding/pem` to decode PEM blocks when parsing private and public keys across CLI commands and test computation scenarios, ensuring compatibility with key files. This enhances robustness in key handling by supporting PEM encoded keys. The update also includes registration of a new Keys command in the CLI.

Refactored code is now compliant with common key formats, addressing potential parsing issues.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Fix auth signature encoding and improve CLI usage example

The authentication system now decodes base64 strings before verifying signatures to align with the expected format. Additionally, the signature generation now encodes the output in base64, ensuring consistency across the auth process.

The CLI help message for the `result` command is enhanced by providing a usage example, making it more user-friendly and informative.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refactor containsID to handle dynamic fields

Updated the `containsID` function to accept a field name parameter, enabling dynamic field lookup within the reflection logic. This change facilitates the use of the function for various struct fields, improving code reusability and flexibility.

CLI command 'data' now requires an additional argument for the private key file path, outlined in the usage example update, reinforcing command clarity and user guidance.

Resolves issues with hardcoded field lookups and enhances CLI usability.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Remove extraneous newline in key generation log output

A redundant newline after the success message in the key generation command was removed to clean up log output formatting. This change ensures a more consistent and professional appearance of the CLI tool's messages.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Implemented auth service in gRPC startup

Added authentication services to the gRPC server initialization to enforce security measures. The gRPC server's New function now includes an `authSvc` parameter, requiring instantiation of the auth service before starting the server. Failure to create the auth service results in a fatal error, halting the process to avoid running without protection. Tests have been updated to include `nil` values for the auth service parameter to maintain their functionality without authentication.

Refactored `grpcserver.New` to accept the new auth service, and updated the main agent startup logic to create and inject the auth service. Added the auth middleware interceptors to the server options, which ensures that each gRPC call will undergo authentication.

This change is a step towards secure communication, and affected components should now consider the authentication requirement.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refactor config read logic and update agent setup

Improved the configuration reading in `cmd/agent/main.go` to handle larger payloads by reading data in chunks and checking for EOF, ensuring that all config data is captured even if it exceeds the initial buffer size. Enhanced the `test/manual/agent-config/main.go` to require additional command-line arguments, improving the setup process by explicitly requiring paths for data, algorithm, and public key as well as a boolean for attested TLS. Also updated the hashing method to SHA3 for the algorithm and data files, and included the hash and public keys as part of the agent, dataset, and result consumer configurations. These changes will make the agent setup more robust and provide better integrity checks for the involved files.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refactor run method to agentService

Moved the run function into agentService for better encapsulation and maintainability. This refactoring includes capturing both stdout and stderr during algorithm execution, enabling more informative debugging through enhanced logging. Consequentially, the run method now references members through the service instance, aligning with object-oriented best practices and improving code coherence.

Resolves issue with insufficient execution details when computations fail.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refactor computation data handling to use filepaths

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refactor error logging and ensure consistency

Replaced usage of the standard log package with a custom logger for error reporting to standardize error logging throughout the application. Additionally, introduced graceful shutdown by returning from the main function rather than forcing exit when failing to create auth service, aligning the application's error handling strategy.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refactor auth initialization and key file handling

Improved the readability and maintainability of the authentication service initialization by adding line breaks for logical separation. Also, standardized key filenames in the CLI key generation by introducing constants, enhancing code clarity and reducing the likelihood of file-naming errors.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refactor auth verification logic for improved security

Removed an extraneous line in the `verifySignature` function that was not necessary for the signature verification process. This change simplifies the code and improves readability.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refactor payload structures to simplify API

Removed the 'provider', 'id', 'consumer' fields from protocol buffers, gRPC services, and related functions across various files to streamline the data model and align with the new authentication system based on cryptographic verification rather than string identifiers. This results in more efficient data handling and a reduction in unnecessary payload data, while enhancing security by making entity validation strictly cryptographic.

The changes affect agent-SDK interactions, CLI tools, and related services, ensuring only the necessary data (algorithm/data bytes, user keys, and hashes) is transmitted and processed. Consequently, the core computation algorithm and dataset handlers now rely on indexes derived from context to associate data with respective manifest entries, thus maintaining the ability to link to specific computation manifests without relying on explicit IDs in the payload. Additionally, refactored authentication methods now enforce role-based security seamlessly through metadata.

This approach enhances privacy by avoiding transmission of potentially sensitive strings over the network and by ensuring that only internal indices, not globally interpretable identifiers, are used to process computations.

Aligned with the broader architectural goal of simplifying and securing the platform's core services, this change paves the way for upcoming revisions to the authentication scheme that will further consolidate role-based security and improve system integrity.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Enhance CLI security with key paths

Removed the section on running computations from the CLI README as it may no longer be necessary or the functionality has been moved elsewhere. Required private key file paths for algorithm, dataset upload, and result retrieval commands to enhance security. This change associates each action with a specific identity, ensuring secure and traceable operations. Additionally, updated the manual test commands to reflect this new requirement.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* fix ci

Signed-off-by: SammyOina <sammyoina@gmail.com>

* fix fmt

Signed-off-by: SammyOina <sammyoina@gmail.com>

---------

Signed-off-by: SammyOina <sammyoina@gmail.com>
2024-05-28 14:10:13 +02:00
Sammy Kerata Oina 8d082567d7 COCOS-101 - Switch to self-contained algorithms as executables (#119)
* Switch to self-contained algorithms as executables

Transitioned from using Python scripts to self-contained binary executables for running algorithms, improving modularity and reducing dependencies. This change removes the reliance on a Python environment, as evident by the removal of Python setup and packages from the build configuration. The service now creates temporary executable files for algorithm runs, handling all permissions and cleanup, enhancing security and maintaining clean execution states. A warning is logged if computation fails, aiding in debugging. Additionally, updated manual tests to reflect these changes in the agent's handling of algorithms.

Refactors:
- Removed Python runtime const since it's no longer needed.
- Updated documentation and test commands to reflect the change from .py to .bin for algorithm files.

Build config:
- Removed Python and pip packages to reduce the build size and complexity.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Update agent service.go file with new constants and file permission

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refine singular usage of 'algorithm' across modules

Standardized terminology throughout the project to refer to 'algorithm' in the singular form rather than plural. Streamlined various documentations, string constants, function names, and variable names to bring cohesiveness and eliminate ambiguity when handling algorithms across README files, CLI interfaces, and internal API representations.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Fix state names and indices in state_string.go

Signed-off-by: SammyOina <sammyoina@gmail.com>

---------

Signed-off-by: SammyOina <sammyoina@gmail.com>
2024-04-29 15:48:17 +02:00
Sammy Kerata Oina 64f7e7f7fd NOISSUE - Refactor single algorithm processing (#117)
* Refactor single algorithm processing

Simplified the agent service's algorithm handling logic to process a single algorithm instead of multiple. This change:
- Removed the `Algorithms` type and associated stringer implementation.
- Updated the state machine and service logic to expect a singular algorithm, aligning the agent's internal state transitions with the new model.
- Adjusted the manager service and computations test server to mirror these changes in their respective payload structures, ensuring API and test consistency.
- Altered README files to reflect the simplified interaction model and removed outdated descriptions.
- Reverted the protoc-gen-go version used for generating protobuf files to maintain compatibility with the rest of the codebase.

The single-algorithm approach streamlines the computation running process, reducing complexity and potential error conditions. It directly impacts how external services will construct and send computation requests.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Update protoc-gen-go version to v1.33.0

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refactor variable name in computations.go and grpc.go

Signed-off-by: SammyOina <sammyoina@gmail.com>

---------

Signed-off-by: SammyOina <sammyoina@gmail.com>
2024-04-17 13:34:54 +02:00
Danko Miladinovic 3a14896555 NOISSUE - Attested TLS (#99)
* added initial code for attested TLS

* added client validation and verification

* fixed bugs for attested TLS

* updated README for manual testing

* fixed CI errors

* removed SNP pollicy from agent config

* added attested TLS config param to AgentConfig

* generated manager.pb.go for protoc v25.2

* updated proto-gen-go version on CI

* generated agent.pb.go to match newest proto gen version

* define errors for error handling

* fixed comments
2024-04-16 15:19:19 +02:00
Sammy Kerata Oina 997fb3bf48 COCOS-83 - Add hash verification for datasets and algorithms (#84)
* Add hash verification for datasets and algorithms

Enhanced data integrity checking by incorporating hash fields in Dataset and Algorithm structures, and modified the corresponding service logic to validate these hashes during processing. The update includes SHA-3 for hash computation, replacing the former SHA-256 usage, and ensures the provided data matches the expected hash from the manifest to prevent processing malformed or corrupted data.

- Introduce `Hash` field to both Dataset and Algorithm structs to store the expected hash value.
- Implement SHA-3 hashing within service methods that process the data, ensuring consistency with newly added `Hash` fields.
- Add error handling for hash mismatches, preventing further processing and alerting to potential data integrity issues.
- Update Protocol Buffers serialization to accommodate the new hash fields for gRPC communication.
- Modify manager service's Run method to pass the hash information when creating agent configurations.

Go module dependencies were updated to include the new SHA-3 package and upgrade Go version to 1.21.6 for improved stability and compatibility.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Remove identifiers from protobuf and related code

The protobuf definitions and related service handling code have been revised to drop specific identifier fields (`AlgorithmID` and `DatasetID`) to simplify API responses and internal function signatures. These removals streamline the overall data flow between components, reduce unnecessary data transmission, and lead to an aligned server-client expectation where identifiers are no longer a part of the response payload. Consequently, these changes simplify the logic within various functions and client commands, reinforcing encapsulation by ensuring that internal identifiers do not need to be managed or exposed unnecessarily.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* fix lint

Signed-off-by: SammyOina <sammyoina@gmail.com>

---------

Signed-off-by: SammyOina <sammyoina@gmail.com>
2024-02-19 15:31:35 +01:00
Sammy Kerata Oina 938dd6cb78 NOISSUE - mTLS support across services (#71)
* Implemented mTLS support across services

Extended gRPC configuration to support mutual TLS (mTLS) in agent and manager components for enhanced security. This includes the loading of Certificate Authority (CA) certificates, server, and client certificates, and keys. Updated README documentation to reflect the new environment variables required for mTLS configuration. Additionally, streamlined secure gRPC client connection setup and logging messages to indicate whether a service is running with TLS, mTLS, or without TLS.

The change ensures secure communication between services by verifying both client and server identities, thus addressing potential security concerns in network-level interactions.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Enhance agent cert handling and update copyright

- Implement function to create certificate files for the agent configuration dynamically, ensuring file paths are updated to reflect newly created files. This improves the agent's setup process by automating the certificate handling.
- Update copyright clause to reflect the new owning entity, Ultraviolet, affirming correct attribution and compliance with legal requirements.
- Refactor gRPC client connection code to remove redundant package alias, streamlining the codebase and improving readability.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refactor cert loading with fallbacks

Removed redundant certificate file creation logic in the agent module and introduced a more robust loading mechanism in the gRPC server module to support direct byte content aside from file paths. This change simplifies the initial setup process for the agent by removing the need to create certificate files preemptively, thereby streamlining deployment in environments with varying filesystem access. It supports using certificate contents directly, enhancing compatibility with in-memory configurations or environments where file storage may not be ideal.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* fix lint

Signed-off-by: SammyOina <sammyoina@gmail.com>

---------

Signed-off-by: SammyOina <sammyoina@gmail.com>
2024-02-08 10:07:51 +01:00
Sammy Kerata Oina 046b549079 Remove HTTP server support and streamline config (#75)
The HTTP server-related code, documentation, and configurations have been removed as part of a shift towards prioritizing gRPC for service communication. This update includes deletions of HTTP host and port configs across various components, the manager HTTP API alongside its Swagger definition, and the removal of related scaffolding and utility code. This change simplifies the overall architecture and eliminates redundant HTTP support, focusing on optimizing gRPC performance and security features.

Signed-off-by: SammyOina <sammyoina@gmail.com>
2024-02-07 11:03:46 +01:00
Sammy Kerata Oina 34c3bbdbd8 COCOS-55 - Pass agent logs to manager via vsock (#64)
* Simplify event handling and config

Streamlined event service interface by consolidating `SendEvent` and introducing `SendRaw`. Removed `notification_server_url` and `instance_id` parameters from several event publication calls to leverage centralized event construction. This change not only cleans up redundancy in event-related code but also simplifies the configuration data flow across the system, making it easier to manage and less error-prone. Uniform event generation now improves consistency and maintainability.

Refactored configuration management in the agent and manager services. Removed notifications URL from the agent configuration, relying on a simplification that assumes a single source of events. Updated Manager Port to VsockConfigPort for clarity and consistency across vsock communication.

These modifications should facilitate easier integration and extension of event and configuration systems in the future.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* fix lint

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refactor error handling in agent event forwarding

Introduced context and error channel handling to the agent event forwarding process. The logger now warns on errors during forward operations asynchronously, allowing for non-blocking error reporting. Additionally, reliance on the global logger was removed in favor of passing error information via channels, improving modularity and error flow control.

Resolves issue with silent forwarding failures by providing a means to alert system operators without halting the service. This enhancement makes the error reporting more robust and reactive while maintaining service continuity.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* remove unused field

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Enhance agent logging via vsock connection

Redirected agent logging to use a vsock connection instead of standard output, improving the process isolation and enabling centralized log management. The change involved dialing to the specified vsock log port and initializing the logger with the vsock connection rather than stdout.

Additionally, the manager service now maintains a map of agent vsock cids to computation IDs, providing better tracking of computation resources. A routine to retrieve logs from agents was also initiated during the service setup to facilitate log collection.

Consequential to these changes is the removal of a redundant os package import in the agent's main.go, further cleaning up the dependencies.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* fail gracefully

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Updated backoff strategy and VM configurations

- Added `github.com/cenkalti/backoff` to direct dependencies for robust retry logic in agent configuration sending.
- Modified the vsock logs port to align with the updated port range standards.
- Enclosed kernel console arguments in quotes to ensure proper parsing in QEMU configurations.
- Implemented exponential backoff when sending agent configurations to handle transient failures.

Refactors:
- Streamlined creation of `AgentConfig` within the computation setup to avoid unnecessary initializations when `c.AgentConfig` is not nil.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refactor command execution and improve argument construction

Consolidated the error handling in the command execution function for better readability. In the QEMU configuration, the argument assembly process is enhanced for clarity and correctness; the VNC parameter is now separate, and string quoting is handled properly for kernel parameters. These changes result in more maintainable code and prevent potential formatting issues during QEMU argument parsing.

Resolves issues with argument construction in QEMU config module.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refine default config handling and unpacking

Improved the agent configuration by dynamically setting default values for the log level and port if they are not specified in the incoming configuration. Also streamlined configuration unpacking in the endpoint and service layers, reducing redundancy and ensuring all required fields are correctly copied over to the Manager's configuration structure. This change ensures better fault tolerance and more maintainable code by handling edge cases where configuration values might be missing.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* rename dir

Signed-off-by: SammyOina <sammyoina@gmail.com>

* fix lint

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Ensure runRes.Empty() reflects non-empty state

Changed the always-true return value of the `runRes.Empty()` method to `false` to accurately indicate the presence of a response body. This adjustment ensures downstream handling of API responses aligns with actual content state.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Replace mglog with slog across codebase

Updated various components to replace the `mglog` logger implementation with the `slog` logger. This change affects logging initialization and calls throughout the codebase including the agent, manager, and internal server components. Transitioning to `slog` is part of a broader shift to standardize the logging mechanism to improve maintainability and consistency.

Signed-off-by: SammyOina <sammyoina@gmail.com>

---------

Signed-off-by: SammyOina <sammyoina@gmail.com>
2024-01-22 14:13:47 +01:00
Sammy Kerata Oina 55afe4c038 COCOS-49 - Pass agent configuration and computation via vsock (#57)
* Optimize QEMU launch and add V-sock support

Refactored QEMU argument construction and launching logic by removing the dependency on 'agent.Computation'. This simplification makes the VM creation process more streamlined. Additionally, introduced V-sock capabilities in the QEMU configuration to facilitate improved guest-host communication. Updated the README to include kernel module setup instructions for the new V-sock feature.

The V-sock implementation enables VMs to use a consistent communication channel that is not affected by network configuration changes, enhancing reliability and potential interoperability with host services. It's important to ensure that the necessary kernel modules are loaded as part of the setup process, as documented.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Add vsock-based communication to manager

Introduced virtual socket (vsock) communication abilities in the manager package by implementing a new socket service. This includes establishing a vsock listener and stub methods for sending computation results and cleaning up resources. The addition provides the groundwork for interprocess communication between guest and host in virtualized environments.

- Integrated the `mdlayher/vsock` library for handling virtual socket operations.
- Created a new `sockService` struct to encapsulate vsock listener handling.
- Implemented `NewVsock` constructor to initialize the listener with domain value `3`.
- Added placeholder methods for future computation sending and service closing logic.

This enhancement targets scenarios where efficient VM-to-host communication is required.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* remove env

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refactor agent config and use vsock

Introduce `AgentConfig` struct to group agent-related configurations, and update `Computation` struct to include the new `AgentConfig` field. Replace command-line computation extraction with vsock-based config retrieval for robustness and decoupling. The agent configuration is now read from a vsock connection during runtime, allowing for more dynamic and flexible deployments. Adjusted the main agent application logic to support these configuration changes, and corresponding changes have been made in the manager to facilitate vsock communication.

This approach aligns with modern practices for microservices by streamlining configuration management and reducing reliance on static command-line parameters. Moreover, it enhances the scalability of the agent service by allowing configuration to be managed externally.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refactor agent config and remove deprecated code

Consolidated agent configuration management into a single `AgentConfig` message and pruned deprecated Protobuf `ComputationReq`, `DatasetReq`, and `AlgorithmReq` messages. Adapted corresponding manager service logic to the new configuration structure. These modifications align with updated manager API schema, facilitate clearer configuration handling, and improve maintainability.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* send configuration

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Switch agent to listen mode for manager connections

Previously, the agent established a connection to the manager using a direct dial. This change shifts the setup to where the agent listens on a specified port and accepts incoming connections. It ensures that the agent properly handles incoming requests by initiating a listening socket and waiting for the manager to connect, enhancing the system's flexibility in connection management. This adjustment also includes graceful closure of the listening socket.

Signed-off-by: SammyOina <sammyoina@gmail.com>

---------

Signed-off-by: SammyOina <sammyoina@gmail.com>
2024-01-19 17:24:51 +01:00
Sammy Kerata Oina c29ef354fa COCOS-24 - Add events service via http (#27)
* add status endpoint

Signed-off-by: sammy <sammyoina@gmail.com>

* feat: Update code generation tools to latest versions

Update the code generation tools, including protoc-gen-go and
protoc-gen-go-grpc, to their latest versions (v1.31.0 and v1.3.0,
respectively). This ensures compatibility with the latest features and
improvements. The updated tools also require gRPC-Go v1.32.0 or later.

The new versions bring important updates and bug fixes, enhancing the
performance and stability of the generated code. By staying up-to-date
with the latest tooling, we can take advantage of the latest
functionality and ensure a smooth development experience.

No code changes are included in this commit. These updates only impact
the code generation process.

Signed-off-by: sammy <sammyoina@gmail.com>

* Update Go version to 1.21.x

Signed-off-by: sammy <sammyoina@gmail.com>

* Refactor agent and manager services to publish event notifications

The refactoring includes changes to the agent and manager services to incorporate event notifications. By publishing events, the services can inform subscribers about the current state of the computation or any updates. Specifically, the `agentService` now includes a `cmpHash` field to store the SHA-256 hash of the computation, which is subsequently used when publishing events. The `agentService` and `managerService` now use the `publisher` interface to publish events to the topic "manager". Notably, the removed `pubsub.go` file is no longer necessary.

This commit improves the service architecture by allowing subscribers to receive relevant updates and monitor the progress of computations. It enhances the overall system by providing more transparency and enabling better coordination between the agent and manager services.

Signed-off-by: sammy <sammyoina@gmail.com>

* Improve generated Go file comparison in checkproto workflow

Refactor the file comparison logic in the checkproto workflow to use the `-p` flag instead of `-s` for improved accuracy. This change ensures that the generated Go files are thoroughly compared with the original ones, detecting any discrepancies and preventing out-of-sync files from passing the validation. By using the `-p` flag, we now check both the contents and the metadata of the files, providing more robust synchronization checks. This update enhances the reliability of the checkproto workflow and helps maintain consistency between the proto files and their corresponding generated Go files.

Signed-off-by: sammy <sammyoina@gmail.com>

* Update file comparison command to detect differences line by line

The code change updates the file comparison command used in the CI workflow to detect differences line by line instead of only reporting the first difference encountered. This change improves the accuracy of detecting inconsistencies between the original protobuf files and the generated Go files. Previously, only the first difference was reported, leading to potential missed issues. By comparing the files line by line, we can now detect and report all differences accurately. This change enhances the reliability of our CI pipeline and ensures that the generated Go files stay in sync with the protobuf files.

Signed-off-by: sammy <sammyoina@gmail.com>

* add event exporting to external server

Signed-off-by: sammy <sammyoina@gmail.com>

* feat: Add support for notification server URL

The commit adds a new environment variable, `COCOS_NOTIFICATION_SERVER_URL`, which allows specifying the server to receive notification events from the agent. This addition provides flexibility to configure the notification server URL based on the deployment environment. This change enables seamless integration with different notification server instances and enhances the extensibility of the system. It resolves the need to modify the code directly when changing the server URL.

Signed-off-by: sammy <sammyoina@gmail.com>

* Refactor gRPC client and server, remove unused handlers

The commit refactors the gRPC client and server code by removing the unused `nopDecoder` and `status` handlers from the client and server, respectively. This cleanup reduces code clutter and improves maintenance. No significant consequences are expected.

Signed-off-by: sammy <sammyoina@gmail.com>

* Ensure generated Go files stay in sync with proto files during the CI workflow

Fixes an issue in the CI workflow where proto files and their corresponding generated Go files were not being properly compared for synchronization. Previously, the `cmp -l` command was used, which only printed differing byte positions, leading to false negatives. This has been corrected by using `cmp -s` instead, which outputs nothing if the files are identical. This change ensures that any differences between the proto files and their generated Go files will be detected, helping to maintain consistency and accuracy in the codebase.

Signed-off-by: sammy <sammyoina@gmail.com>

* Enhance notification payload and endpoint

Extended the notification system to include 'status' and 'details' in the payload, improving traceability and debugging. Adapted the serialized JSON structure for clarity and added an 'originator' field to track the source service. Transitioned to a generalized event endpoint, facilitating a more streamlined event handling process.

Refactors POST request to a more appropriate endpoint and updates the notification service interface to reflect new payload requirements.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refactor event notification logic

Removed the legacy notifications package and consolidated event notification functionality using the new internal events service. Modified agent, manager, and main application code to use this service for consistent event reporting and error handling workflows across services. This change simplifies event management, improves error visibility, and allows for more maintainable code by centralizing event-related logic. The substitution of verbose state-specific publishEvent calls with generic status reporting aligns with the new service's capabilities.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Expand agent protobuf message types and improve error handling

The protobuf definition for agent messages has been updated to include an additional message type, facilitating future data structure expansions. Additionally, error handling for event sending in the main agent execution has been enhanced to log errors when sending 'init' events fail, ensuring issues are properly tracked. The unused `notificationTopic` constant in the manager service has been removed for cleaner code maintenance.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Optimize JSON parsing and fix header omission

Removed unnecessary unquoting of a JSON string before unmarshaling, streamlining the computation value extraction process. Also corrected a missing Content-Type header in the event sending function, ensuring proper handling of JSON requests by recipients. These changes improve performance and communication reliability.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* align vars

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Remove computation request timeout feature

The timeout feature for computation requests has been removed to simplify the computation execution flow. This involved changes across multiple files, including protobuf definitions, HTTP endpoint handling, and the internal computation logic. We eliminated the timeout field, associated logic, and error handling to ensure the system no longer supports timeouts for computations, mitigating any unintended timeout impacts on long-running processes.

Signed-off-by: SammyOina <sammyoina@gmail.com>

---------

Signed-off-by: sammy <sammyoina@gmail.com>
Signed-off-by: SammyOina <sammyoina@gmail.com>
2024-01-11 14:37:14 +01:00
Sammy Kerata Oina ee7159a406 COCOS-35 - Refactor RunRequest to use structured Computation (#38)
* Refactor RunRequest to use structured Computation

The protobuf and associated service implementations for the RunRequest message were refactored to replace the raw Computation byte slice with a structured ComputationReq object. This allows clearer and more type-safe manipulation of computation requests. The grpc, http, and agent service layers were updated to build and parse ComputationReq accordingly. The ComputationReq structure includes details like IDs, names, time stamps, and metadata, forming a well-defined contract for computation tasks.

This change aligns with efforts to standardize request formats and improve clarity in inter-service communication. It impacts all systems interfacing with the RunRequest service and thus requires coordinated updates to the entire stack.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Initialize metadata maps and handle nil values

Improved the robustness of metadata handling in gRPC endpoints and SDK by initializing metadata maps and explicitly checking for nil values before converting them. This ensures that both the agent's gRPC endpoint and the SDK properly handle cases where metadata fields may be uninitialized or contain nil values, preventing potential null pointer exceptions.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Refactor computation request handling

Refactored the endpoint to construct Computation object from gRPC request, incorporating structpb for metadata handling and timestamppb for StartTime and EndTime fields. The management service and API requests are also updated to align with these changes, improving type safety and ensuring data is correctly marshalled when making service calls.

Resolves data marshalling issues for computation requests.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* use singular

Signed-off-by: SammyOina <sammyoina@gmail.com>

* remove unuse fields

Signed-off-by: SammyOina <sammyoina@gmail.com>

* remove unused fields

Signed-off-by: SammyOina <sammyoina@gmail.com>

---------

Signed-off-by: SammyOina <sammyoina@gmail.com>
2024-01-08 17:51:46 +01:00
Sammy Kerata Oina 298878c96d NOISSUE - Rename module to cocos (#22)
* add stringer

Signed-off-by: SammyOina <sammyoina@gmail.com>

* rename module to cocos

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Fix fmt.Stringer implementation in computations.go

The fmt.Stringer implementation for Datasets and Algorithms in computations.go was fixed to correctly use pointers.

This commit addresses the issue where the String() method for Datasets and Algorithms in computations.go was not correctly implemented. The fix ensures that the String() method now correctly marshals the data to JSON and returns the string representation.

The changes made in this commit will improve the functionality and accuracy of the String() method for Datasets and Algorithms.

Signed-off-by: SammyOina <sammyoina@gmail.com>

---------

Signed-off-by: SammyOina <sammyoina@gmail.com>
2023-11-22 13:15:19 +01:00
Sammy Kerata Oina 4958f9468f NOISSUE - Implement State-machine (#16)
* add state machine

Signed-off-by: SammyOina <sammyoina@gmail.com>

* This commit addresses the following changes:

- In agent/service.go:
  - Added a goroutine to start the StateMachine
  - Sent the start event to the StateMachine

- In agent/state.go:
  - Added a copyright notice and license identifier

These changes ensure that the StateMachine starts correctly and that the agent/state.go file includes the necessary copyright and license information.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Fix race condition in agent service initialization

The commit fixes a race condition in the agent service initialization.
Previously, the `New` function in `agent/service.go` would start the state machine
goroutine before initializing the `sm` field. This could lead to a race condition
where the state machine would access uninitialized fields.

To fix this, the `New` function now takes a `context.Context` argument and passes it
to the `Start` method of the state machine. Additionally, a `sync.Mutex` has been
added to the `StateMachine` struct to ensure thread safety.

This commit addresses the race condition and ensures proper initialization of the agent service.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Fix sending events when algorithms and datasets are received

This commit fixes a bug where events were not being sent when algorithms and datasets were received in the agent service. The bug was causing the events to not be triggered, leading to incorrect behavior. This commit adds the necessary code to send the events when algorithms and datasets are received, ensuring that the events are properly triggered.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Fix bug: computation results not ready

The computation results were not being marked as ready,
resulting in an error when trying to access them.
This commit fixes the issue by adding a check for
the computation results before returning them.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Fix bug in agent service and state

The commit fixes a bug in the agent service and state files.
Previously, the condition to check the state in the agent service
was incorrect. It was checking the state directly instead of
using the GetState() method. This has been fixed by using the
GetState() method to check the state.

Additionally, a new GetState() method has been added to the
StateMachine struct in the state file. This method retrieves
the current state by acquiring a lock and returning the state
value.

The changes have been tested and verified to resolve the bug
and improve the accuracy of state checking in the agent service.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* check for time outs

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Fix bug in agent state machine

The bug in the agent state machine caused an error when attempting an invalid transition. This commit fixes the bug by properly locking and unlocking the state machine before and after transitioning to the next state. Additionally, the logger now correctly logs the current and next state during a valid transition.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Fix race condition in state machine

The commit fixes a race condition in the state machine implementation in the `Start` method. The race condition occurs when multiple goroutines try to access and modify the state concurrently. To fix this, a mutex lock and unlock are added around the critical sections of code to ensure exclusive access to the state variable. This prevents race conditions and ensures the state transitions are executed correctly.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Fix race condition in StateMachine.Start()

The StateMachine.Start() method was experiencing a race condition
when multiple events were being processed concurrently. This was
caused by not properly locking and unlocking the state machine
before and after updating the state. This commit fixes the issue
by adding proper locking and unlocking around the state update
operation. Additionally, the logging statement has been updated
to include the previous and next states for better debugging.

Signed-off-by: SammyOina <sammyoina@gmail.com>

---------

Signed-off-by: SammyOina <sammyoina@gmail.com>
2023-11-10 14:41:05 +01:00
Sammy Kerata Oina 8c7d4a76ec NOISSUE - Remove HTTP server from agent and update agent service (#15)
* Update agent.pb.go to include a new field "Provider" in the AlgoRequest struct.

Signed-off-by: SammyOina <sammyoina@gmail.com>

* fix build

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Fix undeclared consumer error in agent service

The commit fixes an error in the agent service where an undeclared consumer was requesting results in the computation manifest. The error is resolved by adding an error message for the undeclared consumer case.

Signed-off-by: SammyOina <sammyoina@gmail.com>

---------

Signed-off-by: SammyOina <sammyoina@gmail.com>
2023-11-10 13:52:25 +01:00
Sammy Kerata Oina f1f8f95653 NOISSUE - Fix dependencies (#13)
* vendor

Signed-off-by: WashingtonKK <washingtonkigan@gmail.com>

* Return agent changes

Signed-off-by: WashingtonKK <washingtonkigan@gmail.com>

* Add missing import

Signed-off-by: WashingtonKK <washingtonkigan@gmail.com>

* remove vendor

Signed-off-by: WashingtonKK <washingtonkigan@gmail.com>

* Fix formatting

Signed-off-by: WashingtonKK <washingtonkigan@gmail.com>

* Formatting errors

Signed-off-by: WashingtonKK <washingtonkigan@gmail.com>

* Update agent/api/grpc/client.go

Signed off: WashingtonKK

washingtonkigan@gmail.com

Co-authored-by: Sammy Kerata Oina <44265300+SammyOina@users.noreply.github.com>

* add linters and fix

Signed-off-by: SammyOina <sammyoina@gmail.com>

* update ci

Signed-off-by: SammyOina <sammyoina@gmail.com>

* remove deprecated dependencies and
 use local agent and manager

Signed-off-by: SammyOina <sammyoina@gmail.com>

* update mainflux

Signed-off-by: SammyOina <sammyoina@gmail.com>

* Fix Jaeger URL in agent and manager main.go files

The Jaeger URL in the agent and manager main.go files was incorrect. This commit fixes the Jaeger URL by updating it to "http://localhost::4318/v1/traces".

Signed-off-by: SammyOina <sammyoina@gmail.com>

---------

Signed-off-by: WashingtonKK <washingtonkigan@gmail.com>
Signed-off-by: SammyOina <sammyoina@gmail.com>
Co-authored-by: WashingtonKK <washingtonkigan@gmail.com>
2023-10-27 22:12:53 +02:00
Darko Draskovic a6d0734212 Refactor run request to use slice of bytes to represent computation
Signed-off-by: Darko Draskovic <darko.draskovic@gmail.com>
2023-05-30 11:27:43 +02:00