As with all application monitoring solutions, the Python agent performs some fixed work for each transaction. Monitoring your code is never totally "free". That said, we always try to keep this work to a minimum. In most situations it is hardly noticeable and well worth the cost for the power that New Relic brings to understanding the performance of your services and applications.
For example, on a typical Django view request that takes a few hundred milliseconds, the work the agent does is a small proportion of the total time of the whole request and response cycle. However, if you are monitoring a very lightweight API where calls to its endpoints run in just a few milliseconds, you may see the agent’s work as a larger proportion of the total work.
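To make the proportion argument concrete, here is a small sketch. The 1 ms figure for agent work is purely an illustrative assumption, not a measured value for the Python agent, and the function name `agent_share` is made up for this example.

```python
def agent_share(agent_ms: float, handler_ms: float) -> float:
    """Return the fraction of total request time spent on agent work."""
    return agent_ms / (agent_ms + handler_ms)


# Assumed, illustrative cost of the agent's fixed per-transaction work.
ASSUMED_AGENT_MS = 1.0

# Hypothetical handler times: a slower Django view vs. a very light API call.
for label, handler_ms in [("Django view", 299.0), ("lightweight API call", 4.0)]:
    share = agent_share(ASSUMED_AGENT_MS, handler_ms)
    print(f"{label}: {share:.1%} of the request is agent work")
```

With these assumed numbers the same fixed cost is well under 1% of the slower request but 20% of the fast one, which is why the overhead looks larger on lightweight endpoints.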
While some of the work performed by the agent is fixed, a few tweaks can help reduce the CPU cycles it uses. In your particular case, we suggested that, if possible, you disable the Redis instrumentation to save some CPU processing power. How much this helps will depend on how heavily the application uses Redis.
You may also want to consider whether it is important to monitor a particular endpoint at all. If you are already monitoring the applications which call an endpoint, the data the callers report on the response time of their requests to this service may be sufficient for your needs. If certain endpoints on the API service are particularly lightweight, you could use the New Relic API to ignore transactions for those specific endpoints.
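As a rough sketch of that last option, the WSGI middleware below calls the Python agent's `newrelic.agent.ignore_transaction()` for selected paths. The paths in `IGNORED_PATHS`, the helper `should_ignore`, and the class name `IgnoreLightweightEndpoints` are all hypothetical, and the sketch assumes the middleware runs inside the agent's own WSGI wrapper so that a transaction is already active when it is called. Adapt it to however your application is actually wired up.

```python
# Hypothetical set of lightweight endpoints to exclude from monitoring.
IGNORED_PATHS = frozenset({"/ping", "/health"})

try:
    import newrelic.agent as nr_agent
except ImportError:  # Lets the sketch run where the agent is not installed.
    nr_agent = None


def should_ignore(path, ignored=IGNORED_PATHS):
    """Return True if the request path is one we do not want reported."""
    normalized = path.rstrip("/") or "/"
    return normalized in ignored


class IgnoreLightweightEndpoints:
    """WSGI middleware that marks matching transactions as ignored."""

    def __init__(self, app, ignored=IGNORED_PATHS):
        self.app = app
        self.ignored = ignored

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        if nr_agent is not None and should_ignore(path, self.ignored):
            # Tells the agent not to report the current transaction.
            nr_agent.ignore_transaction(flag=True)
        return self.app(environ, start_response)
```

You would wrap your WSGI application with it, for example `application = IgnoreLightweightEndpoints(application)`, before the agent wraps the application. Requests to the listed paths are still served normally; they are simply not reported as transactions.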