Introducing Deep OpenReview Research for Efficient Discovery and Analysis of Conference Papers

Deep OpenReview Research

What You’ll Learn

How to find truly relevant papers from thousands of accepted papers
An overview of “deep paper analysis” that goes beyond traditional keyword search
Concrete examples of AI-powered paper evaluation and review information utilization

The Problem: Missing Important Research in a Sea of Papers

Major AI conferences accept an enormous number of papers each year:

NeurIPS 2025: 5,290 papers
ICLR 2025: 3,704 papers
ICML 2025: 3,260 papers

These papers represent months or years of work by researchers worldwide, yet in practice most of them remain buried and unread.

Limitations of Traditional Keyword Search:

Suppose a researcher is interested in “graph generation.” If an important paper is titled “latent space controlled molecular graph synthesis,” a simple keyword search may fail to find it.

Furthermore, OpenReview contains valuable information such as Meta Reviews and Decision Comments for each paper, but conventional search tools cannot fully utilize this rich data.

To address this problem, we developed Deep OpenReview Research.

The Solution: Deep OpenReview Research

Deep OpenReview Research is an AI-powered paper discovery and analysis agent that targets accepted papers on OpenReview. It combines the OpenReview API with LLMs to automatically discover papers related to your research interests, rank them, and generate comprehensive reports.

The Three Depths of “Deep”

1. Semantic Search

Traditional: Exact keyword matching
Deep OpenReview Research: An LLM automatically generates synonyms (e.g., “graph generation” → “molecular graph synthesis” and dozens of related expressions), surfacing papers that would otherwise be missed because of terminology variations
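
To make the difference concrete, here is a minimal sketch of synonym-expanded matching. It is not the tool's actual implementation: the `matches_any` helper and the hard-coded synonym list are hypothetical stand-ins (in the real tool the synonyms come from an LLM), and the matching rule used here (all words of a phrase appear somewhere in the title) is an assumption chosen for illustration.

```python
import re


def matches_any(text: str, phrases: list[str]) -> bool:
    """Return True if every word of at least one phrase occurs in the text."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    for phrase in phrases:
        phrase_words = re.findall(r"[a-z0-9]+", phrase.lower())
        if phrase_words and all(w in words for w in phrase_words):
            return True
    return False


title = "Latent space controlled molecular graph synthesis"

# Keyword search with the original query only.
exact = matches_any(title, ["graph generation"])

# Expanded search: the query plus synonyms. This hard-coded list is a
# hypothetical stand-in for what an LLM would generate.
expanded = matches_any(
    title,
    ["graph generation", "molecular graph synthesis", "graph synthesis"],
)

print(exact, expanded)  # prints: False True
```

The original query misses the paper because the title never uses the word "generation", while the expanded query finds it.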

2. Deep Review Analysis

Traditional: Only titles and abstracts
Deep OpenReview Research: Analyzes Meta Reviews, Decision Comments, review scores, and author responses to understand why papers were accepted

3. Multi-axis Evaluation

Traditional: Evaluation based solely on relevance
Deep OpenReview Research: Comprehensive evaluation across 4 dimensions (relevance, novelty, impact, practicality), enabling prioritization based on research objectives

Key Features

1. Natural Language Research Interest Specification

Describe your research interests in natural language rather than keyword lists.

python run_deep_research.py \
  --venue NeurIPS --year 2025 \
  --research-description "I'm interested in graph generation and its applications to drug discovery"

2. 4-Axis LLM Evaluation

Evaluates papers from 4 perspectives in a single LLM call: relevance, novelty, impact, and practicality. This helps distinguish between “interesting but impractical papers” and “implementable but less novel papers.”
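
As a rough illustration of how four axis scores can be turned into a single ranking score, the sketch below parses a JSON reply and applies a weighted average. The JSON shape, the `AxisScores` class, and both weight presets are assumptions made for this example. The tool's real prompt, response format, and scoring formula may differ.

```python
import json
from dataclasses import dataclass

AXES = ("relevance", "novelty", "impact", "practicality")


@dataclass
class AxisScores:
    relevance: float
    novelty: float
    impact: float
    practicality: float

    def weighted(self, weights: dict[str, float]) -> float:
        """Weighted average of the four axes (weights need not sum to 1)."""
        total = sum(weights[a] for a in AXES)
        return sum(getattr(self, a) * weights[a] for a in AXES) / total


def parse_scores(raw: str) -> AxisScores:
    """Parse a JSON reply and check that every axis is a number in [0, 1]."""
    data = json.loads(raw)
    values = {}
    for axis in AXES:
        value = float(data[axis])
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{axis} out of range: {value}")
        values[axis] = value
    return AxisScores(**values)


# Hypothetical LLM reply for one paper: relevant and novel, but hard to apply.
raw_reply = '{"relevance": 0.9, "novelty": 0.6, "impact": 0.7, "practicality": 0.4}'
scores = parse_scores(raw_reply)

# Hypothetical weight presets for two different research objectives.
research_focus = {"relevance": 0.4, "novelty": 0.3, "impact": 0.2, "practicality": 0.1}
engineering_focus = {"relevance": 0.4, "novelty": 0.1, "impact": 0.1, "practicality": 0.4}

print(round(scores.weighted(research_focus), 3))     # 0.72
print(round(scores.weighted(engineering_focus), 3))  # 0.65
```

The same paper ranks lower under the engineering-oriented weights, which is the kind of distinction the four axes are meant to expose.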

3. Full Utilization of OpenReview Information

Analyzes Meta Reviews, Decision Comments, review scores, author responses, and presentation formats (Oral/Spotlight/Poster) to understand why papers were accepted.
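
The sketch below shows one way such review information might be pulled out of a paper's reply notes. It works on plain dictionaries rather than calling the OpenReview API, and the dictionary layout (an `invitations` list plus `content` fields wrapped in `{"value": ...}`), the field names `rating`, `metareview`, and `decision`, and the sample data are all assumptions. Field names vary by venue and year, so treat this as a starting point rather than the tool's actual code.

```python
import re
from statistics import mean


def field(note: dict, name: str):
    """Read a content field stored either as {"value": ...} or as a bare value."""
    raw = note.get("content", {}).get(name)
    return raw.get("value") if isinstance(raw, dict) else raw


def has_invitation(note: dict, suffix: str) -> bool:
    """True if any invitation ID of the note ends with the given suffix."""
    return any(inv.endswith(suffix) for inv in note.get("invitations", []))


def leading_number(rating):
    """Extract the numeric part of ratings such as 7 or '8: Strong Accept'."""
    match = re.match(r"\s*(\d+(?:\.\d+)?)", str(rating))
    return float(match.group(1)) if match else None


def summarize(replies: list[dict]) -> dict:
    """Collect the decision, meta review, presentation format, and mean rating."""
    summary = {"decision": None, "meta_review": None,
               "presentation": None, "avg_rating": None}
    ratings = []
    for note in replies:
        if has_invitation(note, "Official_Review"):
            score = leading_number(field(note, "rating"))
            if score is not None:
                ratings.append(score)
        elif has_invitation(note, "Meta_Review"):
            summary["meta_review"] = field(note, "metareview")
        elif has_invitation(note, "Decision"):
            decision = field(note, "decision") or ""
            summary["decision"] = decision
            for fmt in ("oral", "spotlight", "poster"):
                if fmt in decision.lower():
                    summary["presentation"] = fmt
                    break
    if ratings:
        summary["avg_rating"] = mean(ratings)
    return summary


# Hypothetical replies for a single submission.
replies = [
    {"invitations": ["Venue/Submission1/-/Official_Review"],
     "content": {"rating": {"value": "8: Strong Accept"}}},
    {"invitations": ["Venue/Submission1/-/Official_Review"],
     "content": {"rating": {"value": 7}}},
    {"invitations": ["Venue/Submission1/-/Meta_Review"],
     "content": {"metareview": {"value": "Reviewers agree the method is novel."}}},
    {"invitations": ["Venue/Submission1/-/Decision"],
     "content": {"decision": {"value": "Accept (oral)"}}},
]

info = summarize(replies)
print(info["avg_rating"], info["presentation"])  # prints: 7.5 oral
```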

4. Automatic Report Generation

Outputs all analysis results as Markdown-formatted reports that can be used as research notes.

Use Cases

Case 1: Molecular Graph Generation Research

python run_deep_research.py \
  --venue NeurIPS --year 2025 \
  --research-description "I'm interested in molecular graph generation and drug discovery applications"

Result: Extracts the 100 most relevant candidates from the 5,290 accepted papers, evaluates them with the LLM, and generates a ranked report of the top 50.

Case 2: LLM Efficiency Techniques Survey

python run_deep_research.py \
  --venue NeurIPS --year 2025 \
  --research-description "LLM quantization and inference acceleration techniques"

Result: Prioritizes practical, implementable technical papers through practicality-focused evaluation.

Processing Flow

1. Extract research keywords from natural language
2. LLM automatically generates synonyms
3. Search papers using OpenReview API
4. Initial filtering by keyword matching
5. Multi-axis LLM evaluation of top k papers (default: 100)
6. Rank by final score and generate report
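
The filtering and ranking stages of this flow can be sketched in a few lines. This is an illustrative outline under stated assumptions, not the project's code: `keyword_hits`, `run_pipeline`, and `stub_evaluate` are hypothetical names, papers are plain dictionaries, and the LLM evaluation is replaced by a stub so the example runs without network access or an API key.

```python
from typing import Callable


def keyword_hits(paper: dict, phrases: list[str]) -> int:
    """Count how many phrases occur in the paper's title or abstract."""
    text = f"{paper['title']} {paper['abstract']}".lower()
    return sum(1 for phrase in phrases if phrase.lower() in text)


def run_pipeline(
    papers: list[dict],
    phrases: list[str],
    evaluate: Callable[[dict], float],
    top_k: int = 100,
) -> list[tuple[float, dict]]:
    """Keyword filter, keep the top_k candidates, score them, and rank."""
    # Initial filtering: drop papers with no hits, keep the best top_k.
    hits = [(keyword_hits(p, phrases), p) for p in papers]
    hits = [item for item in hits if item[0] > 0]
    hits.sort(key=lambda item: item[0], reverse=True)
    candidates = [p for _, p in hits[:top_k]]

    # Evaluation and ranking: only the candidates reach the costly scorer.
    scored = [(evaluate(p), p) for p in candidates]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


def stub_evaluate(paper: dict) -> float:
    """Hypothetical stand-in for the multi-axis LLM evaluation."""
    return 0.9 if "drug" in paper["abstract"].lower() else 0.5


papers = [
    {"title": "Graph generation with diffusion",
     "abstract": "A general method for graph generation."},
    {"title": "Molecular graph synthesis in latent space",
     "abstract": "We study drug discovery."},
    {"title": "Image classification at scale",
     "abstract": "A study of large vision models."},
]
phrases = ["graph generation", "graph synthesis"]  # query plus synonyms

ranked = run_pipeline(papers, phrases, stub_evaluate, top_k=100)
for score, paper in ranked:
    print(f"{score:.2f}  {paper['title']}")
```

Running the cheap keyword filter first keeps the number of LLM calls bounded by `top_k`, regardless of how many papers the venue accepted.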

Example Output Report

# [Rank 1] MolGen: Controllable Molecular Graph Generation

**Scores**
- Final Score: 0.892
- Relevance: 0.950 | Novelty: 0.850 | Impact: 0.825 | Practicality: 0.850
- Average Review Score: 8.2/10

**AI Evaluation**
High relevance to both graph generation and drug discovery applications. 
Proposes a novel approach for controllable molecular generation using 
diffusion models...

**Decision Comment**
Significant contribution to molecular design. Oral presentation.

**Presentation Format**: Oral Presentation (top-tier paper)
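
For readers curious how an entry like the one above might be produced, here is a small formatter sketch. The `render_entry` function and the dictionary keys are hypothetical, and the values are copied from the example report rather than computed. The real tool's report template may be organized differently.

```python
def render_entry(rank: int, paper: dict) -> str:
    """Render one ranked paper as a Markdown report entry."""
    s = paper["scores"]
    lines = [
        f"# [Rank {rank}] {paper['title']}",
        "",
        "**Scores**",
        f"- Final Score: {paper['final_score']:.3f}",
        (
            f"- Relevance: {s['relevance']:.3f} | Novelty: {s['novelty']:.3f} | "
            f"Impact: {s['impact']:.3f} | Practicality: {s['practicality']:.3f}"
        ),
        f"- Average Review Score: {paper['avg_review']:.1f}/10",
        "",
        "**Decision Comment**",
        paper["decision_comment"],
    ]
    return "\n".join(lines)


# Values taken from the example report in this article.
entry = render_entry(1, {
    "title": "MolGen: Controllable Molecular Graph Generation",
    "final_score": 0.892,
    "scores": {"relevance": 0.950, "novelty": 0.850,
               "impact": 0.825, "practicality": 0.850},
    "avg_review": 8.2,
    "decision_comment": "Significant contribution to molecular design. Oral presentation.",
})
print(entry)
```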

Tech Stack

Python 3.12+, LangGraph/LangChain, OpenAI GPT-4, OpenReview API

Getting Started

git clone https://github.com/tb-yasu/deep-openreview-research.git
cd deep-openreview-research
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Copy the example .env file, then set your OpenAI API key in it
cp .env.example .env

# Fetch paper data (first time only, approximately 60-90 minutes depending on environment)
python fetch_all_papers.py --venue NeurIPS --year 2025

# Run
python run_deep_research.py \
  --venue NeurIPS --year 2025 \
  --research-description "Your research interests"

For detailed configuration options, see the GitHub repository (English): https://github.com/tb-yasu/deep-openreview-research

A Japanese-language version of the repository is also available.

Comparison with Other Tools

Deep OpenReview Research differentiates itself through the following features:

✓ Unique to Deep OpenReview Research

Automatic synonym generation
Review information analysis (Meta Review, Decision Comment)
Multi-axis evaluation (4 dimensions: relevance, novelty, impact, practicality)
Acceptance rationale analysis
Custom report generation

△ Semantic Scholar

Only simple metrics (e.g., Highly Influential Citations)

× Google Scholar / Semantic Scholar

Do not support the above features

The key strength of Deep OpenReview Research is its full utilization of the OpenReview API, analyzing deep information from the acceptance process such as Meta Reviews and Decision Comments.

Application Scenarios

Literature Review: Streamline related work surveys for paper writing
Technology Survey: Discover papers emphasizing implementation feasibility
Research Trend Analysis: Cross-year and cross-conference investigations
Paper Reading Group Selection: Select papers based on presentation format (Oral/Spotlight)

FAQ

Q: What are the processing time and cost?

A: Evaluating 100 papers with GPT-4o-mini is designed to take approximately 1-2 minutes and cost roughly $0.05-0.10, depending on prompt design.
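
A back-of-the-envelope calculation shows where a figure in that range can come from. Every number below is an assumption made for illustration: the per-paper token counts are guesses, and the per-million-token prices are placeholder values that should be replaced with the current prices for whichever model you use.

```python
def estimate_cost(
    n_papers: int,
    input_tokens_per_paper: int,
    output_tokens_per_paper: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimated cost in USD for one LLM call per paper."""
    input_cost = n_papers * input_tokens_per_paper / 1_000_000 * input_price_per_million
    output_cost = n_papers * output_tokens_per_paper / 1_000_000 * output_price_per_million
    return input_cost + output_cost


# All values are assumptions, not measurements from the tool.
cost = estimate_cost(
    n_papers=100,
    input_tokens_per_paper=2_500,   # prompt + abstract + review excerpts (guess)
    output_tokens_per_paper=300,    # four scores plus a short rationale (guess)
    input_price_per_million=0.15,   # USD, assumed price
    output_price_per_million=0.60,  # USD, assumed price
)
print(f"${cost:.4f}")  # prints: $0.0555
```

Longer prompts, for example ones that include full reviews and author responses, push the estimate toward the upper end of the range.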

Q: Can it be used offline?

A: Paper data is cached locally, but an API connection is required for LLM evaluation.

Q: Which conferences are supported?

A: Conferences that use OpenReview, such as NeurIPS, ICML, and ICLR.

Q: How accurate are the evaluations?

A: This tool is under active development, and quantitative validation of paper evaluation accuracy is a future task. Scoring methods are subject to discussion. We recommend using AI evaluation results as reference information and having human researchers make final judgments on critical research decisions.

Summary

Finding truly relevant papers from thousands of accepted papers each year is challenging. Deep OpenReview Research addresses this problem through three key approaches:

Expanded search scope through automatic synonym generation - Prevents missing papers due to terminology variations
Deep review analysis - Understands why papers were accepted from Meta Reviews and Decision Comments
Multi-axis prioritization - Comprehensive evaluation across relevance, novelty, impact, and practicality

We aim to reduce paper survey time from days to minutes or tens of minutes.
