Limitations of v1
This document describes some of the limitations of
v1 of the AEA framework and tradeoffs made in its design.
Handlers implemented as behaviours:
Handlers can be considered a special cases of a "behaviour that listens for specific events to happen".
One could implement
Handler classes in terms of
Behaviours, after having implemented the feature that behaviours can be activated after an event happens (e.g. receiving a message of a certain protocol).
This was rejected in favour of a clear separation of concerns, and to avoid purely reactive (handlers) and proactive (behaviours) components to be conflated into one concept. The proposal would also add complexity to behaviour development.
Multiple versions of the same package
The framework does not allow for the usage of multiple versions of the same package in a given project.
Although one could re-engineer the project to allow for this, it does introduce significant additional complexities. Furthermore, Python modules are by design only allowed to exist as one version in a given process. Hence, it seems sensible to maintain this approach in the AEA.
Potential extensions, considered yet not decided:
Alternative skill design
For very simple skills, the splitting of skills into
Task classes can add unnecessary complexity to the framework and a counter-intuitive responsibility split. The splitting also implies the framework needs to introduce the
SkillContext object to allow for access to data across the skill. Furthermore, the framework requires implementing all functionality in
Model. This approach is consistent and transparent, however it creates a lot of boiler plate code for simple skills.
Hence, for some use cases it would be useful to have a single
Skill class with abstract methods
teardown. Then the developer can decide how to split up their code.
Alternatively, we could use decorators to let a developer define whether a function is part of a handler or behaviour. That way, a single file with a number of functions could implement a skill. (Behind the scenes this would utilise a number of virtual
Handler classes provided by the framework).
The downside of this approach is that it does not advocate for much modularity on the skill level. Part of the role of a framework is to propose a common way to do things. The above approach can cause for a larger degree of heterogeneity in the skill design which makes it harder for developers to understand each others' code.
The separation between all four base classes does exist both in convention and at the code level. Handlers deal with skill-external events (messages), behaviours deal with scheduled events (ticks), models represent data and tasks are used to manage long-running business logic.
By adopting strong convention around skill development we allow for the framework to take a more active role in providing guarantees. E.g. handlers' and behaviours' execution can be limited to avoid them being blocking, models can be persisted and recreated, tasks can be executed with different task backends. The opinionated approach is thought to allow for better scaling.
Further modularity for skill level code
Currently we have three levels of modularity:
- PyPI packages
- framework packages: protocols, contracts, connections and skills
- framework plugins: CLI, ledger
We could consider having a fourth level: common behaviours, handlers, models exposed as modules which can then speed up skill development.
Given the asynchronous nature of the framework, it is often hard to implement reactions to specific messages, without making a "fat" handler. Take the example of a handler for a certain type of message
A for a certain protocol
p. The handler for protocol
p would look something like this:
However, it could be helpful to overwrite this handler reaction with another callback (e.g. consider this in context):
This feature would introduce additional complexity for the framework to correctly wire up the callbacks and messages with the dialogues.
CLI using standard lib
Removing the click dependency from the CLI would further reduce the dependencies in the AEA framework which is overall desirable.
Meta data vs configurations
The current approach uses
yaml files to specify both meta data and component configuration. It would be desirable to introduce the following separation:
- package metadata
- package default developer configuration
- package default user configuration
A user can only configure a subset of the configuration. The developer should be able to define these constraints for the user. Similarly, a developer cannot modify all fields in a package, some of them are determined by the framework.
Configuring agent goal setup
By default, the agent's goals are implicitly defined by its skills and the configurations thereof. This is because the default decision maker signs every message and transaction presented to it.
It is already possible to design a custom decision maker. However, more work needs to be done to understand how to improve the usability and configuration of the decision maker. In this context different types of decision makers can be implemented for the developer/user.
Connection status monitoring
Currently, connections are responsible for managing their own status after they have been "connected" by the
Multiplexer. Developers writing connections must take care to properly set its connection status at all times and manage any disconnection. It would potentially be desirable to offer different policies to deal with connection problems on the multiplexer level:
- disconnect one, keep others alive
- disconnect all
- try reconnect indefinitely
Agent snapshots on teardown or error
Currently, the developer must implement snapshots on the component level. It would be desirable if the framework offered more help to persist the agent state on teardown or error.
The current implementation of Dialogues is verbose. Developers often need to subclass
Dialogue classes. More effort can be made to simplify and streamline dialogues management.
Instantiate multiple instances of the same class of
Currently, configuration and metadata of a package are conflated making it not straightforward to run one package component with multiple sets of configuration. It could be desirable to configure an agent to run a given package with multiple different configurations.
This feature could be problematic with respect to component to component messaging which currently relies on component ids, which are bound to the package and not its instance.
Agent management, especially when many of them live on the same host, can be cumbersome. The framework should provide more utilities for these large-scale use cases. But a proper isolation of the agent environment is something that helps also simple use cases.
A new software architecture, somehow inspired to the Docker system. The CLI only involves the initialization of the building of the agent (think of it as the specification of the
Agentfile), but the actual build and run are done by the AEA engine, a daemon process analogous of the Docker Engine, which exposes APIs for these operations.
Users and developers would potentially like to run many AEAs of different versions and with differences in the versions of their dependencies. It is not possible to import different versions of the same Python (PyPI) package in the same process in a clean way. However, in different processes this is trivial with virtual environments. It would be desirable to consider this in the context of a container solution for agents.
Dependency light version of the AEA framework
v1 of the Python AEA implementation makes every effort to minimise the amount of third-party dependencies. However, some dependencies remain to lower development time.
It would be desirable to further reduce the dependencies, and potentially have an implementation that only relies on the Python standard library.
This could be taken further, and a reduced spec version for micropython could be designed.
Python is not a compiled language. However, various projects attempt this, e.g. Nuitka and it would be desirable to explore how useful and practical this would be in the context of AEA.
It would be great to integrate DID in the framework design, specifically identification of packages (most urgently protocols). Other projects and standards worth reviewing in the context (in particular with respect to identity):
Optimise protocol schemas and messages
The focus of protocol development was on extensibility and compatibility, not on optimisation. For instance, the dialogue references use inefficient string representations.
Constraints on primitive types in protocols
The protocol generator currently does not support custom constraints. The framework could add support for custom constraints for the protocol generator and specification.
There are many types of constraints that could be supported in specification and generator. One could perhaps add support based on the popularity of specific constraints from users/developers.
- strings following specific regular expression format (e.g. all lower case, any arbitrary regex format)
- max number of elements on lists/sets
- keys in one
dicttype be equal to keys in another
- other logical constraints, e.g. as supported in ontological languages
- support for bounds (i.e. min, max) for numerical types (i.e.
float) in protocol specification.
This would automatically enable support for signed/unsigned
float. This syntax would allow for unbounded positive/negative/both, or arbitrary bounds to be placed on numerical types.
Currently, the developer has to specify a custom type to implement any constraints on primitive types.
Subprotocols & multi-party interactions
Protocols can be allowed to depend on each other. Similarly, protocols might have multiple parties.
Furthermore, a turn-taking function that specifies who's turn it is at any given point in the dialogue could be added.
Then the current
fipa setup is a specific case of turn-taking where the turn shifts after a player sends a single move (unique-reply). But generally, it does not have to be like this. Players could be allowed to send multiple messages until the turn shifts, or until they send specific speech-acts (multiple-replies).
Timeouts in protocols
Protocols currently do not implement the concept of timeouts. We leave it to the skill developer to implement any time-specific protocol rules.
Framework internal messages
The activation/deactivation of skills and addition/removal of components is implemented in a "passive" way - the skill posts a request in its skill context queue (in the case of new behaviours), or it just sets a flag (in case of activation/deactivation of skills).
One could consider that a skill can send requests to the framework, via the internal protocol, to modify its resources or its status. The
DecisionMaker or the
Filter can be the components that take such actions.
This is a further small but meaningful step toward an actor-based model for agent internals.
Ledger transaction management
Currently, the framework does not manage any aspect of submitting multiple transactions to the ledgers. This responsibility is left to skills. Additionally, the ledger APIs/contract APIs take the ledger as a reference to determine the nonce for a transaction. If a new transaction is sent before a previous transaction has been processed then the nonce will not be incremented correctly for the second transaction. This can lead to submissions of multiple transactions with the same nonce, and therefore failure of subsequent transactions.
A naive approach would involve manually incrementing the nonce and then submitting transactions into the pool with the correct nonce for eventual inclusion. The problem with this approach is that any failure of a transaction will cause non of the subsequent transactions to be processed for some ledgers (https://ethereum.stackexchange.com/questions/2808/what-happens-when-a-transaction-nonce-is-too-high). To recover from a transaction failure not only the failed transaction would need to be handled, but potentially also all subsequent transactions. It is easy to see that logic required to recover from a transaction failure early in a sequence can be arbitrarily complex (involving potentially new negotiations between agents, new signatures having to be generated etc.).
A further problem with the naive approach is that it (imperfectly) replicates the ledger state (with respect to (subset of state of) a specific account).
A simple solution looks as follows: each time a transaction is constructed (requiring a new nonce) the transaction construction is queued until all previous transactions have been included in the ledger or failed. This way, at any one time the agent has only at most one transaction pending with the ledger. Benefits: simple to understand and maintain, transaction only enter the mempool when they are ready for inclusion which has privacy benefits over submitting a whole sequence of transaction at once. Downside: at most one transaction per block.
This approach is currently used and implemented across all the reference skills.
Related, the topic of latency in transactions. State channels provide a solution. E.g. Perun. There could also be an interesting overlap with our protocols here.
Unsolved problems in
Problem 1: connection generates too many messages in a short amount of time, that are not consumed by the multiplexer
Solution: Can be solved by slowing down connections receive method called, controlled by the inbox messages amount
Side effects: Most of the connections should have an internal queue because there is no synchronization between internal logic and multiplexer connection
Problem 2: the send method can take a long time (because send retries logic in connection) Solution: Currently, we apply timeouts on send. Other solutions could be considered, like parallelisation.
Problem 3: too many messages are produced by a skill. Solution: Raise an exception on outbox is full or slow down agent loop?
Agent mobility on ACN
If a peer-client or full client switches peer, then the DHT is not updated properly at the moment under certain conditions.
The two available connections
p2p_libp2p_client imply that the agent is continuously connected and therefore must have uninterrupted network access and the resources to maintain a connection.
For more lightweight implementations, a mailbox connection is desirable, as outlined in the ACN documentation.