How to Implement GPU-Based LLM Inference in AO