
Reasoning with Machines AI Lab

The challenge

Artificial intelligence (AI), and large language models (LLMs) in particular, are revolutionising industries from healthcare to customer service, but understanding their behaviour and ensuring their reliability and fairness is crucial.

At the Reasoning with Machines AI Lab we focus on benchmarking and evaluating AI systems, with a special emphasis on LLMs. We seek to answer critical questions such as: How can we accurately assess the capabilities of AI systems? How do these systems interact with humans? And how can we interpret their internal workings?

Our research

Driven by the need for transparency and trust in AI, the Reasoning with Machines AI Lab is advancing the benchmarking and evaluation of large language models to unlock their potential across various domains. Our approach combines human-computer interaction (HCI) with mechanistic interpretability. By both studying how humans engage with AI systems and exploring the inner mechanics of these models, we aim to develop a deep, nuanced understanding of their functioning and limitations. This dual focus allows us to:

  • Evaluate the reasoning capabilities of AI models, ensuring that they align with human expectations and can perform tasks effectively.
  • Investigate the practical applications of AI in domains like healthcare, where trust and transparency are paramount, and human-computer interaction, where user experience and model reliability are essential.

Our current research areas include:

  • Benchmarking AI reasoning abilities. Current benchmarks often fail to capture the complexities of real-world reasoning. We design new evaluation methods and datasets to measure AI performance across logical, mathematical and common sense reasoning tasks.
  • Understanding failure modes in LLMs. Despite their impressive capabilities, LLMs struggle with consistency, logical coherence and robustness in reasoning-heavy tasks. We analyse these failure points and develop strategies to mitigate them.
  • Human-AI reasoning interactions. AI models increasingly assist humans in decision-making, but their reasoning processes are often opaque. We study how humans interpret and rely on AI-generated reasoning, aiming to make these interactions more transparent and trustworthy.
  • Dynamic evaluation frameworks. As AI systems evolve, traditional benchmarks quickly become obsolete. We explore new methodologies, such as dynamic benchmarking and adversarial testing, to ensure AI reasoning is assessed in a rigorous and meaningful way.

We cultivate a collaborative research environment that encourages innovation and learning for postgraduate students. Through peer learning, code sharing, and hands-on training in data science and STEM fields, we accelerate their learning, helping them to acquire critical skills and engage with cutting-edge research in a dynamic, team-oriented setting.

Our impact

Our work is shaping the future of AI by ensuring that large language models are not only powerful but also trustworthy and understandable. Our research supports the development of AI systems that are capable of more accurate reasoning, fairer interactions, and greater utility in high-stakes environments such as healthcare.

At the same time, our commitment to fostering an inclusive and collaborative research environment ensures that the next generation of researchers is well-prepared to tackle the challenges of AI. By mentoring students and creating a space for knowledge-sharing and skill development, we are building a strong foundation for future AI research and innovation.

Our team

Visiting team members
