Sandbox for code execution

Our pipeline relies on a Python interpreter to execute code generated by LLMs. This creates a security risk, because we are executing arbitrary code that we do not fully control. To partially address this, we provide a basic sandbox that we use to execute code and validate the correctness of LLM-generated answers.

Local sandbox

The default sandbox in our pipeline is a local Docker container. See nemo_skills/code_execution/local_sandbox for implementation details.

Please note that the provided sandbox is not fully secure. You are strongly encouraged to set up a properly configured virtual machine so that generated code executes in an unprivileged environment with no external network access unless necessary.
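As a rough illustration of what "unprivileged, no external network access" can mean at the container level, the sketch below assembles a `docker run` command with networking disabled and privileges dropped. The helper `build_docker_command` and the image name `my-sandbox-image` are hypothetical and are not part of this project. The flags themselves are standard Docker CLI options. This is not how the provided local sandbox is launched, and container flags alone do not replace a properly configured VM.

```python
import shlex
from typing import List


def build_docker_command(image: str, memory: str = "512m", pids_limit: int = 64) -> List[str]:
    """Assemble a `docker run` invocation with a restrictive configuration.

    Hypothetical helper for illustration only; `image` is whatever
    container image you use to execute generated code.
    """
    return [
        "docker", "run", "--rm",
        "--network", "none",                    # no network access at all
        "--user", "65534:65534",                # run as an unprivileged UID:GID
        "--cap-drop", "ALL",                    # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # block privilege escalation
        "--read-only",                          # read-only root filesystem (may need relaxing)
        "--memory", memory,                     # cap memory usage
        "--pids-limit", str(pids_limit),        # cap the number of processes
        image,
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected before being run.
    print(shlex.join(build_docker_command("my-sandbox-image")))
```

If the code being executed needs to write temporary files or reach a specific service, the corresponding flag has to be relaxed deliberately rather than removed wholesale.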

Piston sandbox

A better alternative is to host a Piston server in a properly configured VM. You need to host the server yourself. Once it is running, add the following parameters to the relevant scripts:

++sandbox_type=piston
++sandbox.host=<where your server is hosted, e.g. https://emkc.org/api/v2/piston>
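To check that a Piston server is reachable before pointing the pipeline at it, a minimal sketch like the one below can submit a snippet directly. It assumes the host value already includes the API prefix (as in the example above) and that Piston's v2 `execute` endpoint sits directly under it. This is an assumption about the URL layout, not something taken from this project's code. The function names `build_piston_request` and `run_on_piston` are hypothetical.

```python
import json
import urllib.request


def build_piston_request(host: str, code: str, language: str = "python",
                         version: str = "*") -> urllib.request.Request:
    """Build a POST request for Piston's v2 execute endpoint.

    `version` is a SemVer selector; adjust it to a runtime that is
    actually installed on your server.
    """
    url = host.rstrip("/") + "/execute"
    payload = {
        "language": language,
        "version": version,
        "files": [{"content": code}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_on_piston(host: str, code: str, timeout: float = 30.0) -> dict:
    """Send code to the server and return stdout, stderr, and exit code."""
    request = build_piston_request(host, code)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        result = json.loads(response.read().decode("utf-8"))
    run = result.get("run", {})
    return {
        "stdout": run.get("stdout", ""),
        "stderr": run.get("stderr", ""),
        "exit_code": run.get("code"),
    }
```

Calling `run_on_piston("<your host>", "print(1 + 1)")` against a working server should return a dictionary whose `stdout` field contains the program output.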

Other sandboxes

Our sandbox API makes no assumptions about where or how the code is executed, so it is easy to extend. For example, you can use AWS Lambda functions or similar offerings. Please open an issue if you'd like us to add support for another sandbox.
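To make the idea of a backend-agnostic interface concrete, here is a minimal sketch of how such an abstraction could look. The names `SandboxBackend`, `SubprocessBackend`, and `ExecutionResult` are hypothetical and do not reflect the actual classes in this project. The subprocess backend provides no isolation and only stands in for a real remote backend.

```python
import subprocess
import sys
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class ExecutionResult:
    stdout: str
    stderr: str
    timed_out: bool


class SandboxBackend(ABC):
    """Hypothetical interface: callers only see `execute`, not where it runs."""

    @abstractmethod
    def execute(self, code: str, timeout: float = 10.0) -> ExecutionResult:
        ...


class SubprocessBackend(SandboxBackend):
    """Runs code in a child Python process. NOT isolated; illustration only."""

    def execute(self, code: str, timeout: float = 10.0) -> ExecutionResult:
        try:
            completed = subprocess.run(
                [sys.executable, "-c", code],
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            # The child process is killed by subprocess.run on timeout.
            return ExecutionResult(stdout="", stderr="", timed_out=True)
        return ExecutionResult(completed.stdout, completed.stderr, timed_out=False)


if __name__ == "__main__":
    result = SubprocessBackend().execute("print(6 * 7)")
    print(result.stdout.strip())
```

A backend for a hosted service would implement the same `execute` method but forward the code over the network, so the calling code would not need to change.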