diff --git a/SCIP_RELATIONSHIP_IMPLEMENTATION_PLAN.md b/SCIP_RELATIONSHIP_IMPLEMENTATION_PLAN.md deleted file mode 100644 index cbed682..0000000 --- a/SCIP_RELATIONSHIP_IMPLEMENTATION_PLAN.md +++ /dev/null @@ -1,284 +0,0 @@ -# SCIP 關係圖實施計畫 - -**版本**: 1.0 -**日期**: 2025-01-14 -**狀態**: 規劃階段 - -## 📋 問題分析 - -### 當前狀況 -- ✅ SCIP Protocol Buffer 結構完整實現 -- ✅ 符號定義和出現位置正確處理 -- ❌ **關鍵缺失**: SCIP Relationship 功能完全未實現 -- ❌ 內部 `CallRelationships` 與標準 SCIP `Relationship` 完全分離 - -### 影響評估 -- **合規性**: 目前僅 60-70% 符合 SCIP 標準 -- **功能性**: 關係圖和跨符號導航功能不可用 -- **兼容性**: 無法與標準 SCIP 工具鏈集成 - -## 🎯 目標 - -### 主要目標 -1. **100% SCIP 標準合規性**: 完整實現 `scip_pb2.Relationship` 支援 -2. **關係圖功能**: 啟用函數調用、繼承、實現等關係追蹤 -3. **多語言支援**: 6 種程式語言的關係提取 -4. **向後兼容**: 不破壞現有功能 - -### 成功指標 -- ✅ 所有符號包含正確的 SCIP Relationship 信息 -- ✅ 通過官方 SCIP 驗證工具檢查 -- ✅ 關係查詢 API 正常運作 -- ✅ 性能影響 < 20% - -## 🏗️ 技術架構 - -### 當前架構問題 -``` -[CallRelationships (內部格式)] ❌ 斷層 ❌ [SCIP Relationship (標準格式)] -``` - -### 目標架構 -``` -[程式碼分析] → [關係提取] → [關係管理器] → [SCIP Relationship] → [SymbolInformation] -``` - -### 核心組件 - -#### 1. 關係管理器 (`relationship_manager.py`) -```python -class SCIPRelationshipManager: - """SCIP 關係轉換和管理核心""" - - def create_relationship(self, target_symbol: str, relationship_type: RelationshipType) -> scip_pb2.Relationship - def add_relationships_to_symbol(self, symbol_info: scip_pb2.SymbolInformation, relationships: List[Relationship]) - def convert_call_relationships(self, call_rels: CallRelationships) -> List[scip_pb2.Relationship] -``` - -#### 2. 
關係類型定義 (`relationship_types.py`) -```python -class RelationshipType(Enum): - CALLS = "calls" # 函數調用關係 - INHERITS = "inherits" # 繼承關係 - IMPLEMENTS = "implements" # 實現關係 - REFERENCES = "references" # 引用關係 - TYPE_DEFINITION = "type_definition" # 類型定義關係 -``` - -## 📁 檔案修改計畫 - -### 🆕 新增檔案 (4個) - -#### 核心組件 -``` -src/code_index_mcp/scip/core/ -├── relationship_manager.py # 關係轉換核心 (優先級 1) -└── relationship_types.py # 關係類型定義 (優先級 1) -``` - -#### 測試檔案 -``` -tests/ -├── scip/test_relationship_manager.py # 單元測試 (優先級 3) -└── integration/test_scip_relationships.py # 整合測試 (優先級 3) -``` - -### 🔄 修改現有檔案 (9個) - -#### 核心系統 -``` -src/code_index_mcp/scip/core/ -└── local_reference_resolver.py # 關係存儲和查詢 (優先級 1) -``` - -#### 策略層 -``` -src/code_index_mcp/scip/strategies/ -├── base_strategy.py # 基礎關係處理 (優先級 1) -├── python_strategy.py # Python 關係提取 (優先級 2) -├── javascript_strategy.py # JavaScript 關係提取 (優先級 2) -├── java_strategy.py # Java 關係提取 (優先級 2) -├── objective_c_strategy.py # Objective-C 關係提取 (優先級 2) -├── zig_strategy.py # Zig 關係提取 (優先級 2) -└── fallback_strategy.py # 後備關係處理 (優先級 2) -``` - -#### 分析工具 -``` -src/code_index_mcp/tools/scip/ -├── symbol_definitions.py # 關係數據結構增強 (優先級 2) -└── scip_symbol_analyzer.py # 關係分析整合 (優先級 2) -``` - -## 🗓️ 實施時程 - -### 階段 1:核心基礎 (第1-2週) - 優先級 1 -- [ ] **Week 1.1**: 創建 `relationship_manager.py` - - SCIP Relationship 創建和轉換邏輯 - - 關係類型映射功能 - - 基礎 API 設計 - -- [ ] **Week 1.2**: 創建 `relationship_types.py` - - 內部關係類型枚舉定義 - - SCIP 標準關係映射 - - 關係驗證邏輯 - -- [ ] **Week 2.1**: 修改 `base_strategy.py` - - 新增 `_create_scip_relationships` 方法 - - 修改 `_create_scip_symbol_information` 加入關係處理 - - 新增抽象方法 `_build_symbol_relationships` - -- [ ] **Week 2.2**: 更新 `local_reference_resolver.py` - - 新增關係存儲功能 - - 實現 `add_symbol_relationship` 方法 - - 實現 `get_symbol_relationships` 方法 - -### 階段 2:語言實現 (第3-4週) - 優先級 2 - -#### Week 3: 主要語言策略 -- [ ] **Week 3.1**: Python 策略 (`python_strategy.py`) - - 函數調用關係提取 - - 類繼承關係檢測 - - 方法重寫關係處理 - -- [ ] **Week 3.2**: JavaScript 策略 
(`javascript_strategy.py`) - - 函數調用和原型鏈關係 - - ES6 類繼承關係 - - 模組導入關係 - -#### Week 4: 其他語言策略 -- [ ] **Week 4.1**: Java 策略 (`java_strategy.py`) - - 類繼承和介面實現關係 - - 方法調用關係 - - 包導入關係 - -- [ ] **Week 4.2**: Objective-C 和 Zig 策略 - - Objective-C 協議和繼承關係 - - Zig 結構體和函數關係 - - 後備策略更新 - -- [ ] **Week 4.3**: 工具層更新 - - 更新 `symbol_definitions.py` - - 整合 `scip_symbol_analyzer.py` - -### 階段 3:測試驗證 (第5週) - 優先級 3 -- [ ] **Week 5.1**: 單元測試 - - 關係管理器測試 - - 關係類型轉換測試 - - 各語言策略關係提取測試 - -- [ ] **Week 5.2**: 整合測試 - - 端到端關係功能測試 - - 多語言項目關係測試 - - 性能回歸測試 - -- [ ] **Week 5.3**: SCIP 合規性驗證 - - 使用官方 SCIP 工具驗證 - - 關係格式正確性檢查 - - 兼容性測試 - -### 階段 4:優化完善 (第6週) - 優先級 4 -- [ ] **Week 6.1**: 性能優化 - - 關係查詢 API 優化 - - 記憶體使用優化 - - 大型項目支援測試 - -- [ ] **Week 6.2**: 文檔和工具 - - 更新 ARCHITECTURE.md - - 更新 API 文檔 - - 使用範例和指南 - -- [ ] **Week 6.3**: 發布準備 - - 版本號更新 - - 變更日誌準備 - - 向後兼容性最終檢查 - -## 🧪 測試策略 - -### 單元測試範圍 -```python -# test_relationship_manager.py -def test_create_scip_relationship() -def test_convert_call_relationships() -def test_relationship_type_mapping() - -# test_python_relationships.py -def test_function_call_extraction() -def test_class_inheritance_detection() -def test_method_override_relationships() -``` - -### 整合測試範圍 -```python -# test_scip_relationships.py -def test_end_to_end_relationship_flow() -def test_multi_language_relationship_support() -def test_cross_file_relationship_resolution() -def test_scip_compliance_validation() -``` - -### 測試數據 -- 使用現有 `test/sample-projects/` 中的範例項目 -- 新增特定關係測試案例 -- 包含邊界情況和錯誤處理測試 - -## 📊 風險評估與緩解 - -### 高風險項目 -1. **性能影響**: 關係處理可能影響索引速度 - - **緩解**: 增量關係更新、並行處理 - -2. **複雜度增加**: 多語言關係邏輯複雜 - - **緩解**: 分階段實施、詳細測試 - -3. **向後兼容**: 現有 API 可能受影響 - - **緩解**: 保持現有接口、漸進式更新 - -### 中風險項目 -1. **SCIP 標準理解**: 關係映射可能不精確 - - **緩解**: 參考官方實現、社群驗證 - -2. 
**語言特性差異**: 不同語言關係模型差異大 - - **緩解**: 分語言設計、彈性架構 - -## 🚀 預期成果 - -### 功能改進 -- ✅ 完整的 SCIP 關係圖支援 -- ✅ 跨文件符號導航功能 -- ✅ 與標準 SCIP 工具鏈兼容 -- ✅ 6 種程式語言的關係分析 - -### 合規性提升 -- **當前**: 60-70% SCIP 標準合規 -- **目標**: 95%+ SCIP 標準合規 -- **關鍵**: 100% Relationship 功能合規 - -### 性能目標 -- 索引速度影響 < 15% -- 記憶體使用增長 < 20% -- 大型項目 (1000+ 檔案) 支援良好 - -## 📝 變更管理 - -### 版本控制策略 -- 功能分支開發 (`feature/scip-relationships`) -- 增量 PR 提交,便於審查 -- 完整功能後合併到主分支 - -### 文檔更新 -- [ ] 更新 `ARCHITECTURE.md` 包含關係架構 -- [ ] 更新 `README.md` 功能描述 -- [ ] 新增關係 API 使用指南 -- [ ] 更新 `SCIP_OFFICIAL_STANDARDS.md` 實現狀態 - -### 發布策略 -- 作為主要版本發布 (v3.0.0) -- 提供升級指南和遷移文檔 -- 社群通知和反饋收集 - ---- - -**負責人**: Claude Code -**審查者**: 項目維護者 -**最後更新**: 2025-01-14 \ No newline at end of file diff --git a/diff.txt b/diff.txt deleted file mode 100644 index 4a804fd..0000000 Binary files a/diff.txt and /dev/null differ diff --git a/parse_scip_index.py b/parse_scip_index.py deleted file mode 100644 index e553561..0000000 --- a/parse_scip_index.py +++ /dev/null @@ -1,95 +0,0 @@ -#!/usr/bin/env python3 -"""解析 SCIP 索引文件""" - -import sys -import os -sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'src')) - -def parse_scip_index(): - """解析 SCIP 索引文件""" - - scip_file_path = r"C:\Users\P10362~1\AppData\Local\Temp\code_indexer\22bf459212636f4b8ae327f69d901283\index.scip" - - try: - from code_index_mcp.scip.proto import scip_pb2 - - print(f"🔍 解析 SCIP 文件: {scip_file_path}") - - # 檢查文件是否存在 - if not os.path.exists(scip_file_path): - print("❌ SCIP 文件不存在") - return - - # 獲取文件大小 - file_size = os.path.getsize(scip_file_path) - print(f"📊 文件大小: {file_size} bytes") - - # 讀取並解析 SCIP 文件 - with open(scip_file_path, 'rb') as f: - scip_data = f.read() - - print(f"✅ 讀取了 {len(scip_data)} bytes 的數據") - - # 解析 protobuf - scip_index = scip_pb2.Index() - scip_index.ParseFromString(scip_data) - - print(f"✅ SCIP 索引解析成功") - print(f"📄 文檔數量: {len(scip_index.documents)}") - - # 檢查元數據 - if scip_index.metadata: - print(f"📋 元數據:") - print(f" 版本: {scip_index.metadata.version}") - 
print(f" 項目根目錄: {scip_index.metadata.project_root}") - print(f" 工具信息: {scip_index.metadata.tool_info}") - - # 檢查前幾個文檔 - for i, doc in enumerate(scip_index.documents[:5]): - print(f"\n📄 文檔 {i+1}: {doc.relative_path}") - print(f" 語言: {doc.language}") - print(f" 符號數量: {len(doc.symbols)}") - print(f" 出現次數: {len(doc.occurrences)}") - - # 檢查符號 - for j, symbol in enumerate(doc.symbols[:3]): - print(f" 🔍 符號 {j+1}: {symbol.display_name}") - print(f" 符號 ID: {symbol.symbol}") - print(f" 類型: {symbol.kind}") - print(f" 關係數量: {len(symbol.relationships)}") - - # 檢查關係 - if symbol.relationships: - for k, rel in enumerate(symbol.relationships[:2]): - print(f" 🔗 關係 {k+1}: -> {rel.symbol}") - print(f" is_reference: {rel.is_reference}") - print(f" is_implementation: {rel.is_implementation}") - print(f" is_type_definition: {rel.is_type_definition}") - - # 統計信息 - total_symbols = sum(len(doc.symbols) for doc in scip_index.documents) - total_occurrences = sum(len(doc.occurrences) for doc in scip_index.documents) - total_relationships = sum(len(symbol.relationships) for doc in scip_index.documents for symbol in doc.symbols) - - print(f"\n📊 統計信息:") - print(f" 總文檔數: {len(scip_index.documents)}") - print(f" 總符號數: {total_symbols}") - print(f" 總出現次數: {total_occurrences}") - print(f" 總關係數: {total_relationships}") - - return True - - except Exception as e: - print(f"❌ 解析失敗: {e}") - import traceback - traceback.print_exc() - return False - -if __name__ == "__main__": - print("🚀 開始解析 SCIP 索引文件...") - success = parse_scip_index() - - if success: - print("\n✅ SCIP 索引解析完成!") - else: - print("\n❌ SCIP 索引解析失敗") \ No newline at end of file diff --git a/pyproject.toml b/pyproject.toml index 0702c16..2c0d989 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta" [project] name = "code-index-mcp" -version = "2.1.1" +version = "2.1.2" description = "Code indexing and analysis tools for LLMs using MCP" readme = "README.md" requires-python = ">=3.10" @@ -20,9 
+20,9 @@ dependencies = [ "tree-sitter-javascript>=0.20.0", "tree-sitter-typescript>=0.20.0", "tree-sitter-java>=0.20.0", - "tree-sitter-c>=0.20.0", "tree-sitter-zig>=0.20.0", "pathspec>=0.12.1", + "libclang>=16.0.0", ] [project.urls] diff --git a/requirements.txt b/requirements.txt index c90f59f..1a80b2f 100644 --- a/requirements.txt +++ b/requirements.txt @@ -5,5 +5,6 @@ tree-sitter>=0.20.0 tree-sitter-javascript>=0.20.0 tree-sitter-typescript>=0.20.0 tree-sitter-java>=0.20.0 -tree-sitter-c>=0.20.0 +tree-sitter-zig>=0.20.0 pathspec>=0.12.1 +libclang>=16.0.0 diff --git a/src/code_index_mcp/scip/factory.py b/src/code_index_mcp/scip/factory.py index 5f750d4..1620d8b 100644 --- a/src/code_index_mcp/scip/factory.py +++ b/src/code_index_mcp/scip/factory.py @@ -7,7 +7,13 @@ from .strategies.javascript_strategy import JavaScriptStrategy from .strategies.java_strategy import JavaStrategy from .strategies.objective_c_strategy import ObjectiveCStrategy -from .strategies.zig_strategy import ZigStrategy +# Optional strategies - import only if available +try: + from .strategies.zig_strategy import ZigStrategy + ZIG_AVAILABLE = True +except ImportError: + ZigStrategy = None + ZIG_AVAILABLE = False from .strategies.fallback_strategy import FallbackStrategy from ..constants import SUPPORTED_EXTENSIONS @@ -35,9 +41,12 @@ def _register_all_strategies(self): (JavaScriptStrategy, 95), (JavaStrategy, 95), (ObjectiveCStrategy, 95), - (ZigStrategy, 95), ] + # Add optional strategies if available + if ZIG_AVAILABLE and ZigStrategy: + strategy_classes.append((ZigStrategy, 95)) + for strategy_class, priority in strategy_classes: try: strategy = strategy_class(priority=priority) @@ -127,7 +136,7 @@ def list_supported_extensions(self) -> Set[str]: supported.update({'.java'}) elif isinstance(strategy, ObjectiveCStrategy): supported.update({'.m', '.mm'}) - elif isinstance(strategy, ZigStrategy): + elif ZIG_AVAILABLE and isinstance(strategy, ZigStrategy): supported.update({'.zig', '.zon'}) elif 
isinstance(strategy, FallbackStrategy): # Fallback supports everything, but we don't want to list everything here diff --git a/src/code_index_mcp/scip/strategies/javascript_strategy.py b/src/code_index_mcp/scip/strategies/javascript_strategy.py index ab5104d..489fd37 100644 --- a/src/code_index_mcp/scip/strategies/javascript_strategy.py +++ b/src/code_index_mcp/scip/strategies/javascript_strategy.py @@ -14,6 +14,7 @@ from tree_sitter_javascript import language as js_language from tree_sitter_typescript import language_typescript as ts_language + logger = logging.getLogger(__name__) @@ -27,12 +28,26 @@ def __init__(self, priority: int = 95): super().__init__(priority) # Initialize parsers - js_lang = tree_sitter.Language(js_language()) - ts_lang = tree_sitter.Language(ts_language()) - - self.js_parser = tree_sitter.Parser(js_lang) - self.ts_parser = tree_sitter.Parser(ts_lang) - logger.info("JavaScript strategy initialized with Tree-sitter support") + try: + js_lang = tree_sitter.Language(js_language()) + ts_lang = tree_sitter.Language(ts_language()) + + self.js_parser = tree_sitter.Parser(js_lang) + self.ts_parser = tree_sitter.Parser(ts_lang) + logger.info("JavaScript strategy initialized with Tree-sitter support") + except Exception as e: + logger.error(f"Failed to initialize JavaScript strategy: {e}") + self.js_parser = None + self.ts_parser = None + + # Initialize dependency tracking + self.dependencies = { + 'imports': { + 'standard_library': [], + 'third_party': [], + 'local': [] + } + } def can_handle(self, extension: str, file_path: str) -> bool: """Check if this strategy can handle the file type.""" @@ -40,7 +55,7 @@ def can_handle(self, extension: str, file_path: str) -> bool: def get_language_name(self) -> str: """Get the language name for SCIP symbol generation.""" - return "javascript" # Use 'javascript' for both JS and TS + return "javascript" def is_available(self) -> bool: """Check if this strategy is available.""" @@ -59,7 +74,7 @@ def 
_collect_symbol_definitions(self, files: List[str], project_path: str) -> No self._collect_symbols_from_file(file_path, project_path) processed_count += 1 - if i % 10 == 0 or i == len(files): + if i % 10 == 0 or i == len(files): # Progress every 10 files or at end logger.debug(f"Phase 1 progress: {i}/{len(files)} files, last file: {relative_path}") except Exception as e: @@ -82,14 +97,14 @@ def _generate_documents_with_references(self, files: List[str], project_path: st relative_path = os.path.relpath(file_path, project_path) try: - document = self._analyze_js_file(file_path, project_path, relationships) + document = self._analyze_javascript_file(file_path, project_path, relationships) if document: documents.append(document) total_occurrences += len(document.occurrences) total_symbols += len(document.symbols) processed_count += 1 - if i % 10 == 0 or i == len(files): + if i % 10 == 0 or i == len(files): # Progress every 10 files or at end logger.debug(f"Phase 2 progress: {i}/{len(files)} files, " f"last file: {relative_path}, " f"{len(document.occurrences) if document else 0} occurrences") @@ -121,7 +136,7 @@ def _build_symbol_relationships(self, files: List[str], project_path: str) -> Di for file_path in files: try: - file_relationships = self._extract_js_relationships_from_file(file_path, project_path) + file_relationships = self._extract_relationships_from_file(file_path, project_path) all_relationships.update(file_relationships) except Exception as e: logger.warning(f"Failed to extract relationships from {file_path}: {e}") @@ -135,6 +150,9 @@ def _build_symbol_relationships(self, files: List[str], project_path: str) -> Di def _collect_symbols_from_file(self, file_path: str, project_path: str) -> None: """Collect symbol definitions from a single JavaScript/TypeScript file.""" + # Reset dependencies for this file + self._reset_dependencies() + # Read file content content = self._read_file_content(file_path) if not content: @@ -147,14 +165,15 @@ def 
_collect_symbols_from_file(self, file_path: str, project_path: str) -> None: if not tree or not tree.root_node: raise StrategyError(f"Failed to parse {os.path.relpath(file_path, project_path)}") except Exception as e: - raise StrategyError(f"Parse error in {os.path.relpath(file_path, project_path)}: {e}") + logger.warning(f"Parse error in {os.path.relpath(file_path, project_path)}: {e}") + return - # Collect symbols using Tree-sitter + # Collect symbols using integrated visitor relative_path = self._get_relative_path(file_path, project_path) self._collect_symbols_from_tree(tree, relative_path, content) logger.debug(f"Symbol collection - {relative_path}") - def _analyze_js_file(self, file_path: str, project_path: str, relationships: Optional[Dict[str, List[tuple]]] = None) -> Optional[scip_pb2.Document]: + def _analyze_javascript_file(self, file_path: str, project_path: str, relationships: Optional[Dict[str, List[tuple]]] = None) -> Optional[scip_pb2.Document]: """Analyze a single JavaScript/TypeScript file and generate complete SCIP document.""" relative_path = self._get_relative_path(file_path, project_path) @@ -170,7 +189,8 @@ def _analyze_js_file(self, file_path: str, project_path: str, relationships: Opt if not tree or not tree.root_node: raise StrategyError(f"Failed to parse {relative_path}") except Exception as e: - raise StrategyError(f"Parse error in {relative_path}: {e}") + logger.warning(f"Parse error in {relative_path}: {e}") + return None # Create SCIP document document = scip_pb2.Document() @@ -179,6 +199,7 @@ def _analyze_js_file(self, file_path: str, project_path: str, relationships: Opt # Analyze tree and generate occurrences self.position_calculator = PositionCalculator(content) + occurrences, symbols = self._analyze_tree_for_document(tree, relative_path, content, relationships) # Add results to document @@ -190,7 +211,7 @@ def _analyze_js_file(self, file_path: str, project_path: str, relationships: Opt return document - def 
_extract_js_relationships_from_file(self, file_path: str, project_path: str) -> Dict[str, List[tuple]]: + def _extract_relationships_from_file(self, file_path: str, project_path: str) -> Dict[str, List[tuple]]: """ Extract relationships from a single JavaScript/TypeScript file. @@ -210,7 +231,8 @@ def _extract_js_relationships_from_file(self, file_path: str, project_path: str) if not tree or not tree.root_node: raise StrategyError(f"Failed to parse {file_path} for relationship extraction") except Exception as e: - raise StrategyError(f"Parse error in {file_path}: {e}") + logger.warning(f"Parse error in {file_path}: {e}") + return {} return self._extract_relationships_from_tree(tree, file_path, project_path) @@ -224,43 +246,83 @@ def _parse_js_content(self, content: str, file_path: str): else: parser = self.js_parser + if not parser: + raise StrategyError(f"No parser available for {extension}") + content_bytes = content.encode('utf-8') return parser.parse(content_bytes) - def _collect_symbols_from_tree(self, tree, file_path: str, content: str) -> None: - """Collect symbols from Tree-sitter tree.""" - - def visit_node(node, scope_stack=[]): + """Collect symbols from Tree-sitter tree using integrated visitor.""" + # Use a set to track processed nodes and avoid duplicates + self._processed_nodes = set() + scope_stack = [] + + def visit_node(node, current_scope_stack=None): + if current_scope_stack is None: + current_scope_stack = scope_stack[:] + + # Skip if already processed (by memory address) + node_id = id(node) + if node_id in self._processed_nodes: + return + self._processed_nodes.add(node_id) + node_type = node.type + # Traditional function and class declarations if node_type in ['function_declaration', 'method_definition', 'arrow_function']: - self._register_js_function(node, file_path, scope_stack) + name = self._get_js_function_name(node) + if name: + self._register_function_symbol(node, name, file_path, current_scope_stack) elif node_type in 
['class_declaration']: - self._register_js_class(node, file_path, scope_stack) - + name = self._get_js_class_name(node) + if name: + self._register_class_symbol(node, name, file_path, current_scope_stack) + + # Assignment expressions with function expressions (obj.method = function() {}) + elif node_type == 'assignment_expression': + self._handle_assignment_expression(node, file_path, current_scope_stack) + + # Lexical declarations (const, let, var) + elif node_type == 'lexical_declaration': + self._handle_lexical_declaration(node, file_path, current_scope_stack) + + # Expression statements (might contain method chains) + elif node_type == 'expression_statement': + self._handle_expression_statement(node, file_path, current_scope_stack) + # Recursively visit children for child in node.children: - visit_node(child, scope_stack) + visit_node(child, current_scope_stack) visit_node(tree.root_node) - def _analyze_tree_for_document(self, tree, file_path: str, content: str, relationships: Optional[Dict[str, List[tuple]]] = None) -> tuple: + def _analyze_tree_for_document(self, tree, file_path: str, content: str, relationships: Optional[Dict[str, List[tuple]]] = None) -> tuple[List[scip_pb2.Occurrence], List[scip_pb2.SymbolInformation]]: """Analyze Tree-sitter tree to generate occurrences and symbols for SCIP document.""" occurrences = [] symbols = [] + scope_stack = [] + + # Use the same processed nodes set to avoid duplicates + if not hasattr(self, '_processed_nodes'): + self._processed_nodes = set() - def visit_node(node, scope_stack=[]): + def visit_node(node, current_scope_stack=None): + if current_scope_stack is None: + current_scope_stack = scope_stack[:] + node_type = node.type + # Traditional function and class declarations if node_type in ['function_declaration', 'method_definition', 'arrow_function']: name = self._get_js_function_name(node) if name: - symbol_id = self._create_js_function_symbol_id(name, file_path, scope_stack) - occurrence = 
self._create_js_function_occurrence(node, symbol_id) + symbol_id = self._create_function_symbol_id(name, file_path, current_scope_stack) + occurrence = self._create_function_occurrence(node, symbol_id) symbol_relationships = relationships.get(symbol_id, []) if relationships else [] scip_relationships = self._create_scip_relationships(symbol_relationships) if symbol_relationships else [] - symbol_info = self._create_js_function_symbol_info(node, symbol_id, name, scip_relationships) + symbol_info = self._create_function_symbol_info(node, symbol_id, name, scip_relationships) if occurrence: occurrences.append(occurrence) @@ -270,20 +332,38 @@ def visit_node(node, scope_stack=[]): elif node_type in ['class_declaration']: name = self._get_js_class_name(node) if name: - symbol_id = self._create_js_class_symbol_id(name, file_path, scope_stack) - occurrence = self._create_js_class_occurrence(node, symbol_id) + symbol_id = self._create_class_symbol_id(name, file_path, current_scope_stack) + occurrence = self._create_class_occurrence(node, symbol_id) symbol_relationships = relationships.get(symbol_id, []) if relationships else [] scip_relationships = self._create_scip_relationships(symbol_relationships) if symbol_relationships else [] - symbol_info = self._create_js_class_symbol_info(node, symbol_id, name, scip_relationships) + symbol_info = self._create_class_symbol_info(node, symbol_id, name, scip_relationships) if occurrence: occurrences.append(occurrence) if symbol_info: symbols.append(symbol_info) + + # Assignment expressions with function expressions + elif node_type == 'assignment_expression': + occurrence, symbol_info = self._handle_assignment_for_document(node, file_path, current_scope_stack, relationships) + if occurrence: + occurrences.append(occurrence) + if symbol_info: + symbols.append(symbol_info) + + # Lexical declarations + elif node_type == 'lexical_declaration': + document_symbols = self._handle_lexical_for_document(node, file_path, current_scope_stack, 
relationships) + for occ, sym in document_symbols: + if occ: + occurrences.append(occ) + if sym: + symbols.append(sym) - # Recursively visit children - for child in node.children: - visit_node(child, scope_stack) + # Recursively visit children only if not in assignment or lexical that we handle above + if node_type not in ['assignment_expression', 'lexical_declaration']: + for child in node.children: + visit_node(child, current_scope_stack) visit_node(tree.root_node) return occurrences, symbols @@ -291,15 +371,20 @@ def visit_node(node, scope_stack=[]): def _extract_relationships_from_tree(self, tree, file_path: str, project_path: str) -> Dict[str, List[tuple]]: """Extract relationships from Tree-sitter tree.""" relationships = {} + scope_stack = [] + relative_path = self._get_relative_path(file_path, project_path) - def visit_node(node, scope_stack=[]): + def visit_node(node, current_scope_stack=None): + if current_scope_stack is None: + current_scope_stack = scope_stack[:] + node_type = node.type if node_type == 'class_declaration': - # Extract inheritance relationships for ES6 classes + # Extract inheritance relationships class_name = self._get_js_class_name(node) if class_name: - class_symbol_id = self._create_js_class_symbol_id(class_name, file_path, scope_stack) + class_symbol_id = self._create_class_symbol_id(class_name, relative_path, current_scope_stack) # Look for extends clause for child in node.children: @@ -308,7 +393,7 @@ def visit_node(node, scope_stack=[]): if heritage_child.type == 'identifier': parent_name = self._get_node_text(heritage_child) if parent_name: - parent_symbol_id = self._create_js_class_symbol_id(parent_name, file_path, scope_stack) + parent_symbol_id = self._create_class_symbol_id(parent_name, relative_path, current_scope_stack) if class_symbol_id not in relationships: relationships[class_symbol_id] = [] relationships[class_symbol_id].append((parent_symbol_id, InternalRelationshipType.INHERITS)) @@ -317,14 +402,14 @@ def 
visit_node(node, scope_stack=[]): # Extract function call relationships function_name = self._get_js_function_name(node) if function_name: - function_symbol_id = self._create_js_function_symbol_id(function_name, file_path, scope_stack) + function_symbol_id = self._create_function_symbol_id(function_name, relative_path, current_scope_stack) # Find call expressions within this function - self._extract_calls_from_node(node, function_symbol_id, relationships, file_path, scope_stack) + self._extract_calls_from_node(node, function_symbol_id, relationships, relative_path, current_scope_stack) # Recursively visit children for child in node.children: - visit_node(child, scope_stack) + visit_node(child, current_scope_stack) visit_node(tree.root_node) return relationships @@ -340,7 +425,7 @@ def visit_for_calls(n): if function_node.type == 'identifier': target_name = self._get_node_text(function_node) if target_name: - target_symbol_id = self._create_js_function_symbol_id(target_name, file_path, scope_stack) + target_symbol_id = self._create_function_symbol_id(target_name, file_path, scope_stack) if source_symbol_id not in relationships: relationships[source_symbol_id] = [] relationships[source_symbol_id].append((target_symbol_id, InternalRelationshipType.CALLS)) @@ -374,19 +459,10 @@ def _get_js_class_name(self, node) -> Optional[str]: return self._get_node_text(child) return None - # Symbol registration and creation methods - def _register_js_function(self, node, file_path: str, scope_stack: List[str]) -> None: - """Register a JavaScript function symbol definition.""" - name = self._get_js_function_name(node) - if not name: - return - - symbol_id = self.symbol_manager.create_local_symbol( - language="javascript", - file_path=file_path, - symbol_path=scope_stack + [name], - descriptor="()." 
- ) + # Helper methods + def _register_function_symbol(self, node, name: str, file_path: str, scope_stack: List[str]) -> None: + """Register a function symbol definition.""" + symbol_id = self._create_function_symbol_id(name, file_path, scope_stack) # Create a dummy range for registration dummy_range = scip_pb2.Range() @@ -402,18 +478,9 @@ def _register_js_function(self, node, file_path: str, scope_stack: List[str]) -> documentation=["JavaScript function"] ) - def _register_js_class(self, node, file_path: str, scope_stack: List[str]) -> None: - """Register a JavaScript class symbol definition.""" - name = self._get_js_class_name(node) - if not name: - return - - symbol_id = self.symbol_manager.create_local_symbol( - language="javascript", - file_path=file_path, - symbol_path=scope_stack + [name], - descriptor="#" - ) + def _register_class_symbol(self, node, name: str, file_path: str, scope_stack: List[str]) -> None: + """Register a class symbol definition.""" + symbol_id = self._create_class_symbol_id(name, file_path, scope_stack) # Create a dummy range for registration dummy_range = scip_pb2.Range() @@ -429,35 +496,26 @@ def _register_js_class(self, node, file_path: str, scope_stack: List[str]) -> No documentation=["JavaScript class"] ) - def _create_js_function_symbol_id(self, name: str, file_path: str, scope_stack: List[str]) -> str: - """Create symbol ID for JavaScript function.""" - return self.symbol_manager.create_local_symbol( - language="javascript", - file_path=file_path, - symbol_path=scope_stack + [name], - descriptor="()." - ) + def _create_function_symbol_id(self, name: str, file_path: str, scope_stack: List[str]) -> str: + """Create symbol ID for function.""" + # SCIP standard: local + local_id = ".".join(scope_stack + [name]) if scope_stack else name + return f"local {local_id}()." 
- def _create_js_class_symbol_id(self, name: str, file_path: str, scope_stack: List[str]) -> str: - """Create symbol ID for JavaScript class.""" - return self.symbol_manager.create_local_symbol( - language="javascript", - file_path=file_path, - symbol_path=scope_stack + [name], - descriptor="#" - ) + def _create_class_symbol_id(self, name: str, file_path: str, scope_stack: List[str]) -> str: + """Create symbol ID for class.""" + # SCIP standard: local + local_id = ".".join(scope_stack + [name]) if scope_stack else name + return f"local {local_id}#" - def _create_js_function_occurrence(self, node, symbol_id: str) -> Optional[scip_pb2.Occurrence]: - """Create SCIP occurrence for JavaScript function.""" + def _create_function_occurrence(self, node, symbol_id: str) -> Optional[scip_pb2.Occurrence]: + """Create SCIP occurrence for function.""" if not self.position_calculator: return None try: - # Convert Tree-sitter node to range (simplified) - range_obj = scip_pb2.Range() - range_obj.start.extend([node.start_point[0], node.start_point[1]]) - range_obj.end.extend([node.end_point[0], node.end_point[1]]) - + # Use Tree-sitter position calculation method + range_obj = self.position_calculator.tree_sitter_node_to_range(node) occurrence = scip_pb2.Occurrence() occurrence.symbol = symbol_id occurrence.symbol_roles = scip_pb2.Definition @@ -467,17 +525,14 @@ def _create_js_function_occurrence(self, node, symbol_id: str) -> Optional[scip_ except: return None - def _create_js_class_occurrence(self, node, symbol_id: str) -> Optional[scip_pb2.Occurrence]: - """Create SCIP occurrence for JavaScript class.""" + def _create_class_occurrence(self, node, symbol_id: str) -> Optional[scip_pb2.Occurrence]: + """Create SCIP occurrence for class.""" if not self.position_calculator: return None try: - # Convert Tree-sitter node to range (simplified) - range_obj = scip_pb2.Range() - range_obj.start.extend([node.start_point[0], node.start_point[1]]) - range_obj.end.extend([node.end_point[0], 
node.end_point[1]]) - + # Use Tree-sitter position calculation method + range_obj = self.position_calculator.tree_sitter_node_to_range(node) occurrence = scip_pb2.Occurrence() occurrence.symbol = symbol_id occurrence.symbol_roles = scip_pb2.Definition @@ -487,24 +542,433 @@ def _create_js_class_occurrence(self, node, symbol_id: str) -> Optional[scip_pb2 except: return None - def _create_js_function_symbol_info(self, node, symbol_id: str, name: str, relationships: Optional[List[scip_pb2.Relationship]] = None) -> scip_pb2.SymbolInformation: - """Create SCIP symbol information for JavaScript function.""" + def _create_function_symbol_info(self, node, symbol_id: str, name: str, relationships: Optional[List[scip_pb2.Relationship]] = None) -> scip_pb2.SymbolInformation: + """Create SCIP symbol information for function.""" symbol_info = scip_pb2.SymbolInformation() symbol_info.symbol = symbol_id symbol_info.display_name = name symbol_info.kind = scip_pb2.Function + + # Add documentation - check for JSDoc or comments symbol_info.documentation.append("JavaScript function") + + # Add relationships if provided if relationships and self.relationship_manager: self.relationship_manager.add_relationships_to_symbol(symbol_info, relationships) + return symbol_info - def _create_js_class_symbol_info(self, node, symbol_id: str, name: str, relationships: Optional[List[scip_pb2.Relationship]] = None) -> scip_pb2.SymbolInformation: - """Create SCIP symbol information for JavaScript class.""" + def _create_class_symbol_info(self, node, symbol_id: str, name: str, relationships: Optional[List[scip_pb2.Relationship]] = None) -> scip_pb2.SymbolInformation: + """Create SCIP symbol information for class.""" symbol_info = scip_pb2.SymbolInformation() symbol_info.symbol = symbol_id symbol_info.display_name = name symbol_info.kind = scip_pb2.Class + + # Add documentation - check for JSDoc or comments symbol_info.documentation.append("JavaScript class") + + # Add relationships if provided if 
relationships and self.relationship_manager: self.relationship_manager.add_relationships_to_symbol(symbol_info, relationships) + + return symbol_info + + # JavaScript-specific syntax handlers + def _handle_assignment_expression(self, node, file_path: str, scope_stack: List[str]) -> None: + """Handle assignment expressions like obj.method = function() {}""" + left_child = None + right_child = None + + for child in node.children: + if child.type == 'member_expression': + left_child = child + elif child.type in ['function_expression', 'arrow_function']: + right_child = child + + if left_child and right_child: + # Extract method name from member expression + method_name = self._extract_member_expression_name(left_child) + if method_name: + # Use just the last part as function name for cleaner identification + clean_name = method_name.split('.')[-1] if '.' in method_name else method_name + # Register as function symbol + self._register_function_symbol(right_child, clean_name, file_path, scope_stack + method_name.split('.')[:-1]) + + def _handle_lexical_declaration(self, node, file_path: str, scope_stack: List[str]) -> None: + """Handle lexical declarations like const VAR = value""" + for child in node.children: + if child.type == 'variable_declarator': + # Get variable name and value + var_name = None + var_value = None + + for declarator_child in child.children: + if declarator_child.type == 'identifier': + var_name = self._get_node_text(declarator_child) + elif declarator_child.type in ['object_expression', 'new_expression', 'call_expression']: + var_value = declarator_child + elif declarator_child.type == 'object_pattern': + # Handle destructuring like const { v4: uuidv4 } = require('uuid') + self._handle_destructuring_pattern(declarator_child, file_path, scope_stack) + + if var_name: + # Check if this is an import/require statement + if var_value and var_value.type == 'call_expression': + # Check if it's a require() call + is_require = False + for cc in 
var_value.children: + if cc.type == 'identifier' and self._get_node_text(cc) == 'require': + is_require = True + break + + if is_require: + self._handle_import_statement(var_name, var_value, file_path, scope_stack) + else: + # Register as variable (like const limiter = rateLimit(...)) + self._register_variable_symbol(child, var_name, file_path, scope_stack, var_value) + + # Extract functions from call_expression (like rateLimit config) + self._extract_functions_from_call_expression(var_value, var_name, file_path, scope_stack) + else: + # Register as constant/variable symbol + self._register_variable_symbol(child, var_name, file_path, scope_stack, var_value) + # Extract functions from object expressions + if var_value and var_value.type == 'object_expression': + self._extract_functions_from_object(var_value, var_name, file_path, scope_stack) + + def _handle_expression_statement(self, node, file_path: str, scope_stack: List[str]) -> None: + """Handle expression statements that might contain method chains""" + for child in node.children: + if child.type == 'call_expression': + # Look for method chain patterns like schema.virtual().get() + self._handle_method_chain(child, file_path, scope_stack) + elif child.type == 'assignment_expression': + # Handle nested assignment expressions + self._handle_assignment_expression(child, file_path, scope_stack) + + def _handle_method_chain(self, node, file_path: str, scope_stack: List[str]) -> None: + """Handle method chains like schema.virtual('name').get(function() {})""" + # Look for chained calls that end with function expressions + for child in node.children: + if child.type == 'member_expression': + # This could be a chained method call + member_name = self._extract_member_expression_name(child) + if member_name: + # Look for function arguments + for sibling in node.children: + if sibling.type == 'arguments': + for arg in sibling.children: + if arg.type in ['function_expression', 'arrow_function']: + # Register the function 
with a descriptive name + func_name = f"{member_name}_callback" + self._register_function_symbol(arg, func_name, file_path, scope_stack) + + def _extract_member_expression_name(self, node) -> Optional[str]: + """Extract name from member expression like obj.prop.method""" + parts = [] + + def extract_parts(n): + if n.type == 'member_expression': + # Process children in order: object first, then property + object_child = None + property_child = None + + for child in n.children: + if child.type in ['identifier', 'member_expression']: + object_child = child + elif child.type == 'property_identifier': + property_child = child + + # Recursively extract object part first + if object_child: + if object_child.type == 'member_expression': + extract_parts(object_child) + elif object_child.type == 'identifier': + parts.append(self._get_node_text(object_child)) + + # Then add the property + if property_child: + parts.append(self._get_node_text(property_child)) + + elif n.type == 'identifier': + parts.append(self._get_node_text(n)) + + extract_parts(node) + return '.'.join(parts) if parts else None + + def _register_variable_symbol(self, node, name: str, file_path: str, scope_stack: List[str], value_node=None) -> None: + """Register a variable/constant symbol definition.""" + symbol_id = self._create_variable_symbol_id(name, file_path, scope_stack, value_node) + + # Determine symbol type based on value + symbol_kind = scip_pb2.Variable + doc_type = "JavaScript variable" + + if value_node: + if value_node.type == 'object_expression': + symbol_kind = scip_pb2.Object + doc_type = "JavaScript object" + elif value_node.type == 'new_expression': + symbol_kind = scip_pb2.Variable # new expressions create variables, not classes + doc_type = "JavaScript instance" + elif value_node.type == 'call_expression': + # Check if it's a require call vs regular function call + is_require = False + for child in value_node.children: + if child.type == 'identifier' and self._get_node_text(child) == 
'require': + is_require = True + break + if is_require: + symbol_kind = scip_pb2.Namespace + doc_type = "JavaScript import" + else: + symbol_kind = scip_pb2.Variable + doc_type = "JavaScript constant" + + # Create a dummy range for registration + dummy_range = scip_pb2.Range() + dummy_range.start.extend([0, 0]) + dummy_range.end.extend([0, 1]) + + self.reference_resolver.register_symbol_definition( + symbol_id=symbol_id, + file_path=file_path, + definition_range=dummy_range, + symbol_kind=symbol_kind, + display_name=name, + documentation=[doc_type] + ) + + def _handle_destructuring_pattern(self, node, file_path: str, scope_stack: List[str]) -> None: + """Handle destructuring patterns like { v4: uuidv4 }""" + for child in node.children: + if child.type == 'shorthand_property_identifier_pattern': + # Simple destructuring like { prop } + var_name = self._get_node_text(child) + if var_name: + self._register_variable_symbol(child, var_name, file_path, scope_stack) + elif child.type == 'pair_pattern': + # Renamed destructuring like { v4: uuidv4 } + for pair_child in child.children: + if pair_child.type == 'identifier': + var_name = self._get_node_text(pair_child) + if var_name: + self._register_variable_symbol(pair_child, var_name, file_path, scope_stack) + + def _handle_import_statement(self, var_name: str, call_node, file_path: str, scope_stack: List[str]) -> None: + """Handle import statements like const lib = require('module')""" + # Check if this is a require() call + callee = None + module_name = None + + for child in call_node.children: + if child.type == 'identifier': + callee = self._get_node_text(child) + elif child.type == 'arguments': + # Get the module name from arguments + for arg in child.children: + if arg.type == 'string': + module_name = self._get_node_text(arg).strip('"\'') + break + + if callee == 'require' and module_name: + # Classify dependency type + self._classify_and_store_dependency(module_name) + + # Create SCIP standard symbol ID + local_id = 
".".join(scope_stack + [var_name]) if scope_stack else var_name + symbol_id = f"local {local_id}(import)" + + dummy_range = scip_pb2.Range() + dummy_range.start.extend([0, 0]) + dummy_range.end.extend([0, 1]) + + self.reference_resolver.register_symbol_definition( + symbol_id=symbol_id, + file_path=file_path, + definition_range=dummy_range, + symbol_kind=scip_pb2.Namespace, + display_name=var_name, + documentation=[f"Import from {module_name}"] + ) + + def _handle_assignment_for_document(self, node, file_path: str, scope_stack: List[str], relationships: Optional[Dict[str, List[tuple]]]) -> tuple[Optional[scip_pb2.Occurrence], Optional[scip_pb2.SymbolInformation]]: + """Handle assignment expressions for document generation""" + left_child = None + right_child = None + + for child in node.children: + if child.type == 'member_expression': + left_child = child + elif child.type in ['function_expression', 'arrow_function']: + right_child = child + + if left_child and right_child: + method_name = self._extract_member_expression_name(left_child) + if method_name: + symbol_id = self._create_function_symbol_id(method_name, file_path, scope_stack) + occurrence = self._create_function_occurrence(right_child, symbol_id) + symbol_relationships = relationships.get(symbol_id, []) if relationships else [] + scip_relationships = self._create_scip_relationships(symbol_relationships) if symbol_relationships else [] + symbol_info = self._create_function_symbol_info(right_child, symbol_id, method_name, scip_relationships) + return occurrence, symbol_info + + return None, None + + def _handle_lexical_for_document(self, node, file_path: str, scope_stack: List[str], relationships: Optional[Dict[str, List[tuple]]]) -> List[tuple]: + """Handle lexical declarations for document generation""" + results = [] + + for child in node.children: + if child.type == 'variable_declarator': + var_name = None + var_value = None + + for declarator_child in child.children: + if declarator_child.type == 
'identifier': + var_name = self._get_node_text(declarator_child) + elif declarator_child.type in ['object_expression', 'new_expression', 'call_expression']: + var_value = declarator_child + + if var_name: + # Create occurrence and symbol info for variable + symbol_id = self._create_variable_symbol_id(var_name, file_path, scope_stack, var_value) + occurrence = self._create_variable_occurrence(child, symbol_id) + symbol_info = self._create_variable_symbol_info(child, symbol_id, var_name, var_value) + results.append((occurrence, symbol_info)) + + return results + + def _create_variable_symbol_id(self, name: str, file_path: str, scope_stack: List[str], value_node=None) -> str: + """Create symbol ID for variable.""" + # SCIP standard: local + local_id = ".".join(scope_stack + [name]) if scope_stack else name + + # Determine descriptor based on value type + descriptor = "." # Default for variables + if value_node: + if value_node.type == 'object_expression': + descriptor = "{}" + elif value_node.type == 'new_expression': + descriptor = "." # new expressions are still variables, not classes + elif value_node.type == 'call_expression': + # Check if it's a require call vs regular function call + is_require = False + for child in value_node.children: + if child.type == 'identifier' and hasattr(self, '_get_node_text'): + if self._get_node_text(child) == 'require': + is_require = True + break + descriptor = "(import)" if is_require else "." 
+ + return f"local {local_id}{descriptor}" + + def _create_variable_occurrence(self, node, symbol_id: str) -> Optional[scip_pb2.Occurrence]: + """Create SCIP occurrence for variable.""" + if not self.position_calculator: + return None + + try: + range_obj = self.position_calculator.tree_sitter_node_to_range(node) + occurrence = scip_pb2.Occurrence() + occurrence.symbol = symbol_id + occurrence.symbol_roles = scip_pb2.Definition + occurrence.syntax_kind = scip_pb2.IdentifierConstant + occurrence.range.CopyFrom(range_obj) + return occurrence + except: + return None + + def _create_variable_symbol_info(self, node, symbol_id: str, name: str, value_node=None) -> scip_pb2.SymbolInformation: + """Create SCIP symbol information for variable.""" + symbol_info = scip_pb2.SymbolInformation() + symbol_info.symbol = symbol_id + symbol_info.display_name = name + + # Determine kind based on value - correct classification + if value_node: + if value_node.type == 'object_expression': + symbol_info.kind = scip_pb2.Object + symbol_info.documentation.append("JavaScript object literal") + elif value_node.type == 'new_expression': + symbol_info.kind = scip_pb2.Variable # new expressions create variables, not classes + symbol_info.documentation.append("JavaScript instance variable") + elif value_node.type == 'call_expression': + symbol_info.kind = scip_pb2.Namespace + symbol_info.documentation.append("JavaScript import") + elif value_node.type == 'function_expression': + symbol_info.kind = scip_pb2.Function + symbol_info.documentation.append("JavaScript function variable") + else: + symbol_info.kind = scip_pb2.Variable + symbol_info.documentation.append("JavaScript variable") + else: + symbol_info.kind = scip_pb2.Variable + symbol_info.documentation.append("JavaScript variable") + return symbol_info + + def _extract_functions_from_object(self, object_node, parent_name: str, file_path: str, scope_stack: List[str]) -> None: + """Extract functions from object expressions like { handler: 
function() {} }""" + for child in object_node.children: + if child.type == 'pair': + prop_name = None + prop_value = None + + for pair_child in child.children: + if pair_child.type in ['identifier', 'property_identifier']: + prop_name = self._get_node_text(pair_child) + elif pair_child.type in ['function_expression', 'arrow_function']: + prop_value = pair_child + + if prop_name and prop_value: + # Register function with context-aware name + func_scope = scope_stack + [parent_name] + self._register_function_symbol(prop_value, prop_name, file_path, func_scope) + + def _extract_functions_from_call_expression(self, call_node, parent_name: str, file_path: str, scope_stack: List[str]) -> None: + """Extract functions from call expressions arguments like rateLimit({ handler: function() {} })""" + for child in call_node.children: + if child.type == 'arguments': + for arg in child.children: + if arg.type == 'object_expression': + self._extract_functions_from_object(arg, parent_name, file_path, scope_stack) + elif arg.type in ['function_expression', 'arrow_function']: + # Anonymous function in call - give it a descriptive name + func_name = f"{parent_name}_callback" + self._register_function_symbol(arg, func_name, file_path, scope_stack) + + def _classify_and_store_dependency(self, module_name: str) -> None: + """Classify and store dependency based on module name.""" + # Standard Node.js built-in modules + node_builtins = { + 'fs', 'path', 'http', 'https', 'url', 'crypto', 'os', 'util', 'events', + 'stream', 'buffer', 'child_process', 'cluster', 'dgram', 'dns', 'net', + 'tls', 'zlib', 'readline', 'repl', 'vm', 'worker_threads', 'async_hooks' + } + + if module_name in node_builtins: + category = 'standard_library' + elif module_name.startswith('./') or module_name.startswith('../') or module_name.startswith('/'): + category = 'local' + else: + category = 'third_party' + + # Avoid duplicates + if module_name not in self.dependencies['imports'][category]: + 
self.dependencies['imports'][category].append(module_name) + + def get_dependencies(self) -> Dict[str, Any]: + """Get collected dependencies for MCP response.""" + return self.dependencies + + def _reset_dependencies(self) -> None: + """Reset dependency tracking for new file analysis.""" + self.dependencies = { + 'imports': { + 'standard_library': [], + 'third_party': [], + 'local': [] + } + } \ No newline at end of file diff --git a/src/code_index_mcp/scip/strategies/objective_c_strategy.py b/src/code_index_mcp/scip/strategies/objective_c_strategy.py index d8bbf7d..c27dc87 100644 --- a/src/code_index_mcp/scip/strategies/objective_c_strategy.py +++ b/src/code_index_mcp/scip/strategies/objective_c_strategy.py @@ -1,51 +1,67 @@ -"""Objective-C SCIP indexing strategy - SCIP standard compliant.""" +""" +Objective-C Strategy for SCIP indexing using libclang. + +This strategy uses libclang to parse Objective-C source files (.m, .mm, .h) +and extract symbol information following SCIP standards. 
+""" import logging import os -import re -from typing import List, Optional, Dict, Any, Set +from typing import List, Set, Optional, Tuple, Dict, Any from pathlib import Path -import tree_sitter -from tree_sitter_c import language as c_language +try: + import clang.cindex as clang + from clang.cindex import CursorKind, TypeKind + LIBCLANG_AVAILABLE = True +except ImportError: + LIBCLANG_AVAILABLE = False + clang = None + CursorKind = None + TypeKind = None from .base_strategy import SCIPIndexerStrategy, StrategyError from ..proto import scip_pb2 from ..core.position_calculator import PositionCalculator from ..core.relationship_types import InternalRelationshipType - logger = logging.getLogger(__name__) class ObjectiveCStrategy(SCIPIndexerStrategy): - """SCIP-compliant Objective-C indexing strategy using Tree-sitter + regex patterns.""" - - SUPPORTED_EXTENSIONS = {'.m', '.mm'} - + """SCIP indexing strategy for Objective-C using libclang.""" + + SUPPORTED_EXTENSIONS = {'.m', '.mm', '.h'} + def __init__(self, priority: int = 95): """Initialize the Objective-C strategy.""" super().__init__(priority) + self._processed_symbols: Set[str] = set() + self._symbol_counter = 0 + self.project_path: Optional[str] = None - # Initialize C parser (handles Objective-C syntax reasonably well) - c_lang = tree_sitter.Language(c_language()) - self.parser = tree_sitter.Parser(c_lang) - def can_handle(self, extension: str, file_path: str) -> bool: """Check if this strategy can handle the file type.""" + if not LIBCLANG_AVAILABLE: + logger.warning("libclang not available for Objective-C processing") + return False return extension.lower() in self.SUPPORTED_EXTENSIONS - + def get_language_name(self) -> str: """Get the language name for SCIP symbol generation.""" return "objc" - + def is_available(self) -> bool: """Check if this strategy is available.""" - return True - + return LIBCLANG_AVAILABLE + def _collect_symbol_definitions(self, files: List[str], project_path: str) -> None: """Phase 
1: Collect all symbol definitions from Objective-C files.""" logger.debug(f"ObjectiveCStrategy Phase 1: Processing {len(files)} files for symbol collection") + + # Store project path for use in import classification + self.project_path = project_path + processed_count = 0 error_count = 0 @@ -67,9 +83,9 @@ def _collect_symbol_definitions(self, files: List[str], project_path: str) -> No logger.info(f"Phase 1 summary: {processed_count} files processed, {error_count} errors") def _generate_documents_with_references(self, files: List[str], project_path: str, relationships: Optional[Dict[str, List[tuple]]] = None) -> List[scip_pb2.Document]: - """Phase 2: Generate complete SCIP documents with resolved references.""" + """Phase 3: Generate complete SCIP documents with resolved references.""" documents = [] - logger.debug(f"ObjectiveCStrategy Phase 2: Generating documents for {len(files)} files") + logger.debug(f"ObjectiveCStrategy Phase 3: Generating documents for {len(files)} files") processed_count = 0 error_count = 0 total_occurrences = 0 @@ -87,89 +103,22 @@ def _generate_documents_with_references(self, files: List[str], project_path: st processed_count += 1 if i % 10 == 0 or i == len(files): - logger.debug(f"Phase 2 progress: {i}/{len(files)} files, " + logger.debug(f"Phase 3 progress: {i}/{len(files)} files, " f"last file: {relative_path}, " f"{len(document.occurrences) if document else 0} occurrences") except Exception as e: error_count += 1 - logger.error(f"Phase 2 failed for {relative_path}: {e}") + logger.error(f"Phase 3 failed for {relative_path}: {e}") continue - logger.info(f"Phase 2 summary: {processed_count} documents generated, {error_count} errors, " + logger.info(f"Phase 3 summary: {processed_count} documents generated, {error_count} errors, " f"{total_occurrences} total occurrences, {total_symbols} total symbols") return documents - def _collect_symbols_from_file(self, file_path: str, project_path: str) -> None: - """Collect symbol definitions from a 
single Objective-C file.""" - # Read file content - content = self._read_file_content(file_path) - if not content: - logger.debug(f"Empty file skipped: {os.path.relpath(file_path, project_path)}") - return - - # Parse with Tree-sitter - tree = self._parse_content(content) - if not tree: - logger.debug(f"Parse failed: {os.path.relpath(file_path, project_path)}") - return - - # Collect symbols using both Tree-sitter and regex - relative_path = self._get_relative_path(file_path, project_path) - self._collect_symbols_from_tree_and_regex(tree, relative_path, content) - logger.debug(f"Symbol collection - {relative_path}") - - def _analyze_objc_file(self, file_path: str, project_path: str, relationships: Optional[Dict[str, List[tuple]]] = None) -> Optional[scip_pb2.Document]: - """Analyze a single Objective-C file and generate complete SCIP document.""" - # Read file content - content = self._read_file_content(file_path) - if not content: - return None - - # Parse with Tree-sitter - tree = self._parse_content(content) - if not tree: - return None - - # Create SCIP document - document = scip_pb2.Document() - document.relative_path = self._get_relative_path(file_path, project_path) - document.language = 'objc' if file_path.endswith('.m') else 'objcpp' - - # Analyze AST and generate occurrences - self.position_calculator = PositionCalculator(content) - occurrences, symbols = self._analyze_tree_and_regex_for_document(tree, document.relative_path, content, relationships) - - # Add results to document - document.occurrences.extend(occurrences) - document.symbols.extend(symbols) - - logger.debug(f"Analyzed Objective-C file {document.relative_path}: " - f"{len(document.occurrences)} occurrences, {len(document.symbols)} symbols") - - return document - - def _parse_content(self, content: str) -> Optional: - """Parse Objective-C content with tree-sitter C parser.""" - try: - content_bytes = content.encode('utf-8') - return self.parser.parse(content_bytes) - except Exception as e: - 
logger.error(f"Failed to parse Objective-C content: {e}") - return None - def _build_symbol_relationships(self, files: List[str], project_path: str) -> Dict[str, List[tuple]]: - """ - Build relationships between Objective-C symbols. - - Args: - files: List of file paths to process - project_path: Project root path - - Returns: - Dictionary mapping symbol_id -> [(target_symbol_id, relationship_type), ...] - """ + """Phase 2: Build relationships between Objective-C symbols.""" logger.debug(f"ObjectiveCStrategy: Building symbol relationships for {len(files)} files") all_relationships = {} @@ -185,699 +134,950 @@ def _build_symbol_relationships(self, files: List[str], project_path: str) -> Di logger.debug(f"ObjectiveCStrategy: Built {total_relationships} relationships for {total_symbols_with_relationships} symbols") return all_relationships - - def _extract_relationships_from_file(self, file_path: str, project_path: str) -> Dict[str, List[tuple]]: - """Extract relationships from a single Objective-C file.""" + + def _collect_symbols_from_file(self, file_path: str, project_path: str) -> None: + """Collect symbol definitions from a single Objective-C file using libclang.""" content = self._read_file_content(file_path) if not content: - return {} - - relationships = {} - relative_path = self._get_relative_path(file_path, project_path) - - # Class inheritance patterns - interface_pattern = r"@interface\s+(\w+)\s*:\s*(\w+)" - for match in re.finditer(interface_pattern, content): - child_class = match.group(1) - parent_class = match.group(2) - - child_symbol_id = self.symbol_manager.create_local_symbol( - language="objc", - file_path=relative_path, - symbol_path=[child_class], - descriptor="#" - ) - parent_symbol_id = self.symbol_manager.create_local_symbol( - language="objc", - file_path=relative_path, - symbol_path=[parent_class], - descriptor="#" + logger.debug(f"Empty file skipped: {os.path.relpath(file_path, project_path)}") + return + + try: + # Parse with libclang + 
index = clang.Index.create() + translation_unit = index.parse( + file_path, + args=['-ObjC', '-x', 'objective-c'], + options=clang.TranslationUnit.PARSE_DETAILED_PROCESSING_RECORD ) - if child_symbol_id not in relationships: - relationships[child_symbol_id] = [] - relationships[child_symbol_id].append((parent_symbol_id, InternalRelationshipType.INHERITS)) - - # Protocol adoption patterns - protocol_pattern = r"@interface\s+(\w+).*?<(.+?)>" - for match in re.finditer(protocol_pattern, content, re.DOTALL): - class_name = match.group(1) - protocols = [p.strip() for p in match.group(2).split(",")] + if not translation_unit: + logger.debug(f"Parse failed: {os.path.relpath(file_path, project_path)}") + return - class_symbol_id = self.symbol_manager.create_local_symbol( - language="objc", - file_path=relative_path, - symbol_path=[class_name], - descriptor="#" - ) + # Reset processed symbols for each file + self._processed_symbols.clear() + self._symbol_counter = 0 + + # Traverse AST to collect symbols + relative_path = self._get_relative_path(file_path, project_path) + self._traverse_clang_ast_for_symbols(translation_unit.cursor, relative_path, content, file_path) + + # Extract imports/dependencies and register with symbol manager + self._extract_and_register_imports(translation_unit.cursor, file_path, project_path) + + logger.debug(f"Symbol collection completed - {relative_path}") + + except Exception as e: + logger.error(f"Error processing {file_path} with libclang: {e}") + + def _extract_and_register_imports(self, cursor: 'clang.Cursor', file_path: str, project_path: str) -> None: + """Extract imports from AST and register them with the symbol manager.""" + try: + # Traverse AST to find all import statements + self._traverse_ast_for_import_registration(cursor, file_path, project_path) + + except Exception as e: + logger.error(f"Error extracting imports from {file_path}: {e}") + + def _traverse_ast_for_import_registration(self, cursor: 'clang.Cursor', file_path: str, 
project_path: str) -> None: + """Traverse AST specifically to register imports with the symbol manager.""" + try: + # Process current cursor for import registration + if cursor.kind == CursorKind.INCLUSION_DIRECTIVE: + self._register_import_with_symbol_manager(cursor, file_path, project_path) - for protocol in protocols: - if protocol and protocol.replace(" ", "").isidentifier(): - protocol_symbol_id = self.symbol_manager.create_local_symbol( + # Recursively process children + for child in cursor.get_children(): + self._traverse_ast_for_import_registration(child, file_path, project_path) + + except Exception as e: + logger.error(f"Error traversing AST for import registration: {e}") + + def _register_import_with_symbol_manager(self, cursor: 'clang.Cursor', file_path: str, project_path: str) -> None: + """Register a single import with the symbol manager.""" + try: + # Try to get the included file path + include_path = None + framework_name = None + + # Method 1: Try to get the included file (may fail for system headers) + try: + included_file = cursor.get_included_file() + if included_file: + include_path = str(included_file) + logger.debug(f"Got include path from file: {include_path}") + except Exception as e: + logger.debug(f"Failed to get included file: {e}") + + # Method 2: Try to get from cursor spelling (the actual #import statement) + spelling = cursor.spelling + if spelling: + logger.debug(f"Got cursor spelling: {spelling}") + # Extract framework name from spelling like "Foundation/Foundation.h" or "Person.h" + framework_name = self._extract_framework_name_from_spelling(spelling) + if framework_name: + logger.debug(f"Extracted framework name from spelling: {framework_name}") + + # Classify based on spelling pattern + import_type = self._classify_import_from_spelling(spelling) + logger.debug(f"Classified import as: {import_type}") + + # Only register external dependencies (not local files) + if import_type in ['standard_library', 'third_party']: + if not 
self.symbol_manager: + logger.error("Symbol manager is None!") + return + + # Determine version if possible (for now, leave empty) + version = "" + + logger.debug(f"Registering external symbol: {framework_name}") + + # Register the import with the moniker manager + symbol_id = self.symbol_manager.create_external_symbol( + language="objc", + package_name=framework_name, + module_path=framework_name, + symbol_name="*", # Framework-level import + version=version, + alias=None + ) + + logger.debug(f"Registered external dependency: {framework_name} ({import_type}) -> {symbol_id}") + return + else: + logger.debug(f"Skipping local import: {framework_name} ({import_type})") + return + + # Method 3: Fallback to include_path if we have it + if include_path: + logger.debug(f"Processing include path: {include_path}") + + # Extract framework/module name + framework_name = self._extract_framework_name(include_path, cursor) + if not framework_name: + logger.debug(f"No framework name extracted from {include_path}") + return + + logger.debug(f"Extracted framework name: {framework_name}") + + # Classify the import type + import_type = self._classify_objc_import(include_path) + logger.debug(f"Classified import as: {import_type}") + + # Only register external dependencies (not local files) + if import_type in ['standard_library', 'third_party']: + if not self.symbol_manager: + logger.error("Symbol manager is None!") + return + + # Determine version if possible (for now, leave empty) + version = self._extract_framework_version(include_path) + + logger.debug(f"Registering external symbol: {framework_name}") + + # Register the import with the moniker manager + symbol_id = self.symbol_manager.create_external_symbol( language="objc", - file_path=relative_path, - symbol_path=[protocol.strip()], - descriptor="#" + package_name=framework_name, + module_path=framework_name, + symbol_name="*", # Framework-level import + version=version, + alias=None ) - if class_symbol_id not in relationships: 
- relationships[class_symbol_id] = [] - relationships[class_symbol_id].append((protocol_symbol_id, InternalRelationshipType.IMPLEMENTS)) - - logger.debug(f"Extracted {len(relationships)} relationships from {relative_path}") - return relationships - - # Symbol collection methods - def _collect_symbols_from_tree_and_regex(self, tree, file_path: str, content: str) -> None: - """Collect symbols using both Tree-sitter and regex patterns.""" - scope_stack = [] - lines = content.split('\n') - - # First, use Tree-sitter for C-like constructs - self._collect_symbols_from_tree_sitter(tree.root_node, file_path, scope_stack, content) - - # Then, use regex for Objective-C specific constructs - self._collect_symbols_from_regex_patterns(file_path, lines, scope_stack) - - def _collect_symbols_from_tree_sitter(self, node, file_path: str, scope_stack: List[str], content: str): - """Collect symbols from Tree-sitter AST for C-like constructs.""" - node_type = node.type - - if node_type == 'function_definition': - self._register_c_function_symbol(node, file_path, scope_stack, content) - elif node_type == 'struct_specifier': - self._register_struct_symbol(node, file_path, scope_stack, content) - elif node_type == 'enum_specifier': - self._register_enum_symbol(node, file_path, scope_stack, content) - elif node_type == 'typedef_declaration': - self._register_typedef_symbol(node, file_path, scope_stack, content) - - # Recursively analyze child nodes - for child in node.children: - self._collect_symbols_from_tree_sitter(child, file_path, scope_stack, content) - - def _collect_symbols_from_regex_patterns(self, file_path: str, lines: List[str], scope_stack: List[str]): - """Collect Objective-C specific symbols using regex patterns.""" - patterns = { - 'interface': re.compile(r'@interface\s+(\w+)(?:\s*:\s*(\w+))?', re.MULTILINE), - 'implementation': re.compile(r'@implementation\s+(\w+)', re.MULTILINE), - 'protocol': re.compile(r'@protocol\s+(\w+)', re.MULTILINE), - 'property': 
re.compile(r'@property[^;]*?\s+(\w+)\s*;', re.MULTILINE), - 'instance_method': re.compile(r'^[-]\s*\([^)]*\)\s*(\w+)', re.MULTILINE), - 'class_method': re.compile(r'^[+]\s*\([^)]*\)\s*(\w+)', re.MULTILINE), - 'category': re.compile(r'@interface\s+(\w+)\s*\(\s*(\w*)\s*\)', re.MULTILINE), - } - - for line_num, line in enumerate(lines): - line = line.strip() - - # @interface declarations - match = patterns['interface'].match(line) - if match: - class_name = match.group(1) - self._register_objc_class_symbol(class_name, file_path, scope_stack, "Objective-C interface") - continue + logger.debug(f"Registered external dependency: {framework_name} ({import_type}) -> {symbol_id}") + else: + logger.debug(f"Skipping local import: {framework_name} ({import_type})") + else: + logger.debug("No include path or spelling found for cursor") + + except Exception as e: + logger.error(f"Error registering import with symbol manager: {e}") + import traceback + logger.error(f"Traceback: {traceback.format_exc()}") - # @implementation - match = patterns['implementation'].match(line) - if match: - class_name = match.group(1) - self._register_objc_class_symbol(class_name, file_path, scope_stack, "Objective-C implementation") - continue + def _extract_framework_name_from_spelling(self, spelling: str) -> Optional[str]: + """Extract framework name from cursor spelling.""" + try: + # Remove quotes and angle brackets + clean_spelling = spelling.strip('"<>') + + # For framework imports like "Foundation/Foundation.h" + if '/' in clean_spelling: + parts = clean_spelling.split('/') + if len(parts) >= 2: + framework_name = parts[0] + return framework_name + + # For simple includes like "MyHeader.h" + header_name = clean_spelling.replace('.h', '').replace('.m', '').replace('.mm', '') + return header_name + + except Exception as e: + logger.debug(f"Error extracting framework name from spelling {spelling}: {e}") + return None - # @protocol - match = patterns['protocol'].match(line) - if match: - 
protocol_name = match.group(1) - self._register_objc_protocol_symbol(protocol_name, file_path, scope_stack) - continue + def _classify_import_from_spelling(self, spelling: str) -> str: + """Classify import based on spelling pattern.""" + try: + # Remove quotes and angle brackets + clean_spelling = spelling.strip('"<>') + + # Check if it's a known system framework by name (since cursor.spelling doesn't include brackets) + if '/' in clean_spelling: + framework_name = clean_spelling.split('/')[0] + system_frameworks = { + 'Foundation', 'UIKit', 'CoreData', 'CoreGraphics', 'QuartzCore', + 'AVFoundation', 'CoreLocation', 'MapKit', 'CoreAnimation', + 'Security', 'SystemConfiguration', 'CFNetwork', 'CoreFoundation', + 'AppKit', 'Cocoa', 'WebKit', 'JavaScriptCore', 'Metal', 'MetalKit', + 'GameplayKit', 'SpriteKit', 'SceneKit', 'ARKit', 'Vision', 'CoreML' + } + if framework_name in system_frameworks: + return 'standard_library' + + # Check for single framework names (like just "Foundation.h") + framework_name_only = clean_spelling.replace('.h', '').replace('.framework', '') + system_frameworks = { + 'Foundation', 'UIKit', 'CoreData', 'CoreGraphics', 'QuartzCore', + 'AVFoundation', 'CoreLocation', 'MapKit', 'CoreAnimation', + 'Security', 'SystemConfiguration', 'CFNetwork', 'CoreFoundation', + 'AppKit', 'Cocoa', 'WebKit', 'JavaScriptCore', 'Metal', 'MetalKit', + 'GameplayKit', 'SpriteKit', 'SceneKit', 'ARKit', 'Vision', 'CoreML' + } + if framework_name_only in system_frameworks: + return 'standard_library' + + # Angle brackets indicate system headers (if we had them) + if spelling.startswith('<') and spelling.endswith('>'): + return 'standard_library' + + # Quotes indicate local or third-party headers + elif spelling.startswith('"') and spelling.endswith('"'): + # Check for common third-party patterns + if any(pattern in clean_spelling.lower() for pattern in ['pods/', 'carthage/', 'node_modules/']): + return 'third_party' + + # Default for quoted imports + return 'local' + + 
# Check for common third-party patterns in the path + if any(pattern in clean_spelling.lower() for pattern in ['pods/', 'carthage/', 'node_modules/']): + return 'third_party' + + # Check if it looks like a local header (simple filename) + if '/' not in clean_spelling and clean_spelling.endswith('.h'): + return 'local' + + # Fallback: if it contains system-like paths, classify as standard_library + if any(pattern in clean_spelling.lower() for pattern in ['/system/', '/usr/', '/applications/xcode']): + return 'standard_library' + + # Default fallback + return 'local' + + except Exception as e: + logger.debug(f"Error classifying import from spelling {spelling}: {e}") + return 'local' - # @property - match = patterns['property'].search(line) - if match: - property_name = match.group(1) - self._register_objc_property_symbol(property_name, file_path, scope_stack) - continue + def _extract_framework_version(self, include_path: str) -> str: + """Extract framework version from include path if available.""" + # For now, return empty string. 
Could be enhanced to detect versions + # from CocoaPods Podfile.lock, Carthage, or other dependency managers + return "" - # Instance methods - match = patterns['instance_method'].match(line) - if match: - method_name = match.group(1) - self._register_objc_method_symbol(method_name, False, file_path, scope_stack) - continue + def _analyze_objc_file(self, file_path: str, project_path: str, relationships: Optional[Dict[str, List[tuple]]] = None) -> Optional[scip_pb2.Document]: + """Analyze a single Objective-C file and generate complete SCIP document.""" + content = self._read_file_content(file_path) + if not content: + return None - # Class methods - match = patterns['class_method'].match(line) - if match: - method_name = match.group(1) - self._register_objc_method_symbol(method_name, True, file_path, scope_stack) - continue + try: + # Parse with libclang + index = clang.Index.create() + translation_unit = index.parse( + file_path, + args=['-ObjC', '-x', 'objective-c'], + options=clang.TranslationUnit.PARSE_DETAILED_PROCESSING_RECORD + ) + + if not translation_unit: + return None - # Document analysis methods - def _analyze_tree_and_regex_for_document(self, tree, file_path: str, content: str) -> tuple: - """Analyze using both Tree-sitter and regex patterns to generate SCIP data.""" - occurrences = [] - symbols = [] - scope_stack = [] - lines = content.split('\n') - - # First, use Tree-sitter for C-like constructs - tree_occs, tree_syms = self._analyze_tree_sitter_for_document(tree.root_node, file_path, scope_stack, content) - occurrences.extend(tree_occs) - symbols.extend(tree_syms) - - # Then, use regex for Objective-C specific constructs - regex_occs, regex_syms = self._analyze_regex_patterns_for_document(file_path, lines, scope_stack) - occurrences.extend(regex_occs) - symbols.extend(regex_syms) - - return occurrences, symbols + # Create SCIP document + document = scip_pb2.Document() + document.relative_path = self._get_relative_path(file_path, project_path) + 
document.language = self._get_document_language(file_path) - def _analyze_tree_sitter_for_document(self, node, file_path: str, scope_stack: List[str], content: str) -> tuple: - """Analyze Tree-sitter nodes for C-like constructs and generate SCIP data.""" - occurrences = [] - symbols = [] - - node_type = node.type - - if node_type == 'function_definition': - occ, sym = self._process_c_function_for_document(node, file_path, scope_stack, content) - if occ: occurrences.append(occ) - if sym: symbols.append(sym) - elif node_type == 'struct_specifier': - occ, sym = self._process_struct_for_document(node, file_path, scope_stack, content) - if occ: occurrences.append(occ) - if sym: symbols.append(sym) - elif node_type == 'enum_specifier': - occ, sym = self._process_enum_for_document(node, file_path, scope_stack, content) - if occ: occurrences.append(occ) - if sym: symbols.append(sym) - elif node_type == 'typedef_declaration': - occ, sym = self._process_typedef_for_document(node, file_path, scope_stack, content) - if occ: occurrences.append(occ) - if sym: symbols.append(sym) - elif node_type == 'identifier': - occ = self._process_identifier_reference_for_document(node, file_path, scope_stack, content) - if occ: occurrences.append(occ) - - # Recursively analyze child nodes - for child in node.children: - child_occs, child_syms = self._analyze_tree_sitter_for_document(child, file_path, scope_stack, content) - occurrences.extend(child_occs) - symbols.extend(child_syms) - - return occurrences, symbols + # Initialize position calculator + self.position_calculator = PositionCalculator(content) + + # Reset processed symbols for each file + self._processed_symbols.clear() + self._symbol_counter = 0 + + # Generate occurrences and symbols + occurrences = [] + symbols = [] + + # Traverse AST for document generation + self._traverse_clang_ast_for_document(translation_unit.cursor, content, occurrences, symbols, relationships) - def _analyze_regex_patterns_for_document(self, file_path: 
str, lines: List[str], scope_stack: List[str]) -> tuple: - """Analyze Objective-C specific patterns using regex for SCIP document generation.""" - occurrences = [] - symbols = [] - - patterns = { - 'interface': re.compile(r'@interface\s+(\w+)(?:\s*:\s*(\w+))?', re.MULTILINE), - 'implementation': re.compile(r'@implementation\s+(\w+)', re.MULTILINE), - 'protocol': re.compile(r'@protocol\s+(\w+)', re.MULTILINE), - 'property': re.compile(r'@property[^;]*?\s+(\w+)\s*;', re.MULTILINE), - 'instance_method': re.compile(r'^[-]\s*\([^)]*\)\s*(\w+)', re.MULTILINE), - 'class_method': re.compile(r'^[+]\s*\([^)]*\)\s*(\w+)', re.MULTILINE), - } - - for line_num, line in enumerate(lines): - line = line.strip() - - # @interface declarations - match = patterns['interface'].match(line) - if match: - class_name = match.group(1) - occ, sym = self._create_objc_class_symbol_for_document(line_num, class_name, file_path, scope_stack, "Objective-C interface") - if occ: occurrences.append(occ) - if sym: symbols.append(sym) - continue + # Add results to document + document.occurrences.extend(occurrences) + document.symbols.extend(symbols) - # @implementation - match = patterns['implementation'].match(line) - if match: - class_name = match.group(1) - occ, sym = self._create_objc_class_symbol_for_document(line_num, class_name, file_path, scope_stack, "Objective-C implementation") - if occ: occurrences.append(occ) - if sym: symbols.append(sym) - continue + logger.debug(f"Analyzed Objective-C file {document.relative_path}: " + f"{len(document.occurrences)} occurrences, {len(document.symbols)} symbols") - # @protocol - match = patterns['protocol'].match(line) - if match: - protocol_name = match.group(1) - occ, sym = self._create_objc_protocol_symbol_for_document(line_num, protocol_name, file_path, scope_stack) - if occ: occurrences.append(occ) - if sym: symbols.append(sym) - continue + return document + + except Exception as e: + logger.error(f"Error analyzing {file_path} with libclang: {e}") + 
return None - # @property - match = patterns['property'].search(line) - if match: - property_name = match.group(1) - occ, sym = self._create_objc_property_symbol_for_document(line_num, property_name, file_path, scope_stack) - if occ: occurrences.append(occ) - if sym: symbols.append(sym) - continue + def _traverse_clang_ast_for_symbols(self, cursor: 'clang.Cursor', file_path: str, content: str, full_file_path: str) -> None: + """Traverse libclang AST for symbol definitions (Phase 1).""" + try: + # Process current cursor + self._process_cursor_for_symbols(cursor, file_path, content, full_file_path) + + # Recursively process children + for child in cursor.get_children(): + self._traverse_clang_ast_for_symbols(child, file_path, content, full_file_path) + + except Exception as e: + logger.error(f"Error traversing AST for symbols: {e}") - # Instance methods - match = patterns['instance_method'].match(line) - if match: - method_name = match.group(1) - occ, sym = self._create_objc_method_symbol_for_document(line_num, method_name, False, file_path, scope_stack) - if occ: occurrences.append(occ) - if sym: symbols.append(sym) - continue + def _traverse_clang_ast_for_imports(self, cursor: 'clang.Cursor', file_path: str, imports: 'ImportGroup') -> None: + """Traverse libclang AST specifically for import/include statements.""" + try: + # Process current cursor for imports + self._process_cursor_for_imports(cursor, file_path, imports) + + # Recursively process children + for child in cursor.get_children(): + self._traverse_clang_ast_for_imports(child, file_path, imports) + + except Exception as e: + logger.error(f"Error traversing AST for imports: {e}") - # Class methods - match = patterns['class_method'].match(line) - if match: - method_name = match.group(1) - occ, sym = self._create_objc_method_symbol_for_document(line_num, method_name, True, file_path, scope_stack) - if occ: occurrences.append(occ) - if sym: symbols.append(sym) - continue - - return occurrences, symbols - - # 
Symbol registration methods (Phase 1) - def _register_c_function_symbol(self, node, file_path: str, scope_stack: List[str], content: str): - """Register a C function symbol.""" - declarator = self._find_child_by_type(node, 'function_declarator') - if declarator: - name_node = self._find_child_by_type(declarator, 'identifier') - if name_node: - name = self._get_node_text(name_node, content) - if name: - symbol_id = self.symbol_manager.create_local_symbol( - language="objc", - file_path=file_path, - symbol_path=scope_stack + [name], - descriptor="()." - ) - dummy_range = scip_pb2.Range() - dummy_range.start.extend([0, 0]) - dummy_range.end.extend([0, 1]) - self.reference_resolver.register_symbol_definition( - symbol_id=symbol_id, - file_path=file_path, - definition_range=dummy_range, - symbol_kind=scip_pb2.Function, - display_name=name, - documentation=["C function"] - ) + def _traverse_clang_ast_for_document(self, cursor: 'clang.Cursor', content: str, occurrences: List, symbols: List, relationships: Optional[Dict[str, List[tuple]]] = None) -> None: + """Traverse libclang AST for document generation (Phase 3).""" + try: + # Process current cursor + self._process_cursor_for_document(cursor, content, occurrences, symbols, relationships) + + # Recursively process children + for child in cursor.get_children(): + self._traverse_clang_ast_for_document(child, content, occurrences, symbols, relationships) + + except Exception as e: + logger.error(f"Error traversing AST for document: {e}") - def _register_struct_symbol(self, node, file_path: str, scope_stack: List[str], content: str): - """Register a struct symbol.""" - name_node = self._find_child_by_type(node, 'type_identifier') - if name_node: - name = self._get_node_text(name_node, content) - if name: - symbol_id = self.symbol_manager.create_local_symbol( - language="objc", - file_path=file_path, - symbol_path=scope_stack + [name], - descriptor="#" - ) - dummy_range = scip_pb2.Range() - dummy_range.start.extend([0, 0]) - 
dummy_range.end.extend([0, 1]) + def _process_cursor_for_symbols(self, cursor: 'clang.Cursor', file_path: str, content: str, full_file_path: str) -> None: + """Process a cursor for symbol registration (Phase 1).""" + try: + # Skip invalid cursors or those outside our file + if not cursor.location.file or cursor.spelling == "": + return + + # Check if cursor is in the file we're processing + cursor_file = str(cursor.location.file) + if not cursor_file.endswith(os.path.basename(full_file_path)): + return + + cursor_kind = cursor.kind + symbol_name = cursor.spelling + + # Map libclang cursor kinds to SCIP symbols + symbol_info = self._map_cursor_to_symbol(cursor, symbol_name) + if not symbol_info: + return + + symbol_id, symbol_kind, symbol_roles = symbol_info + + # Avoid duplicates + duplicate_key = f"{symbol_id}:{cursor.location.line}:{cursor.location.column}" + if duplicate_key in self._processed_symbols: + return + self._processed_symbols.add(duplicate_key) + + # Calculate position + location = cursor.location + if location.line is not None and location.column is not None: + # libclang uses 1-based indexing, convert to 0-based + line = location.line - 1 + column = location.column - 1 + + # Calculate end position (approximate) + end_line = line + end_column = column + len(symbol_name) + + # Register symbol with reference resolver + if self.position_calculator: + range_obj = self.position_calculator.line_col_to_range(line, column, end_line, end_column) + else: + # Create a simple range object if position_calculator is not available + from ..proto.scip_pb2 import Range + range_obj = Range() + range_obj.start.extend([line, column]) + range_obj.end.extend([end_line, end_column]) self.reference_resolver.register_symbol_definition( symbol_id=symbol_id, file_path=file_path, - definition_range=dummy_range, - symbol_kind=scip_pb2.Struct, - display_name=name, - documentation=["C struct"] + definition_range=range_obj, + symbol_kind=symbol_kind, + display_name=symbol_name, + 
documentation=[f"Objective-C {cursor_kind.name}"] ) + + logger.debug(f"Registered Objective-C symbol: {symbol_name} ({cursor_kind.name}) at {line}:{column}") + + except Exception as e: + logger.error(f"Error processing cursor for symbols {cursor.spelling}: {e}") - def _register_enum_symbol(self, node, file_path: str, scope_stack: List[str], content: str): - """Register an enum symbol.""" - name_node = self._find_child_by_type(node, 'type_identifier') - if name_node: - name = self._get_node_text(name_node, content) - if name: - symbol_id = self.symbol_manager.create_local_symbol( - language="objc", - file_path=file_path, - symbol_path=scope_stack + [name], - descriptor="#" - ) - dummy_range = scip_pb2.Range() - dummy_range.start.extend([0, 0]) - dummy_range.end.extend([0, 1]) - self.reference_resolver.register_symbol_definition( - symbol_id=symbol_id, - file_path=file_path, - definition_range=dummy_range, - symbol_kind=scip_pb2.Enum, - display_name=name, - documentation=["C enum"] - ) + def _process_cursor_for_document(self, cursor: 'clang.Cursor', content: str, occurrences: List, symbols: List, relationships: Optional[Dict[str, List[tuple]]] = None) -> None: + """Process a cursor for document generation (Phase 3).""" + try: + # Skip invalid cursors or those outside our file + if not cursor.location.file or cursor.spelling == "": + return + + cursor_kind = cursor.kind + symbol_name = cursor.spelling + + # Map libclang cursor kinds to SCIP symbols + symbol_info = self._map_cursor_to_symbol(cursor, symbol_name) + if not symbol_info: + return + + symbol_id, symbol_kind, symbol_roles = symbol_info + + # Avoid duplicates + duplicate_key = f"{symbol_id}:{cursor.location.line}:{cursor.location.column}" + if duplicate_key in self._processed_symbols: + return + self._processed_symbols.add(duplicate_key) + + # Calculate position + location = cursor.location + if location.line is not None and location.column is not None: + # libclang uses 1-based indexing, convert to 0-based + 
line = location.line - 1 + column = location.column - 1 + + # Calculate end position (approximate) + end_line = line + end_column = column + len(symbol_name) + + # Create SCIP occurrence + occurrence = self._create_occurrence(symbol_id, line, column, end_line, end_column, symbol_roles) + if occurrence: + occurrences.append(occurrence) + + # Get relationships for this symbol + symbol_relationships = relationships.get(symbol_id, []) if relationships else [] + scip_relationships = self._create_scip_relationships(symbol_relationships) if symbol_relationships else [] + + # Create SCIP symbol information with relationships + symbol_info_obj = self._create_symbol_information_with_relationships(symbol_id, symbol_name, symbol_kind, scip_relationships) + if symbol_info_obj: + symbols.append(symbol_info_obj) + + logger.debug(f"Added Objective-C symbol: {symbol_name} ({cursor_kind.name}) at {line}:{column} with {len(scip_relationships)} relationships") + + except Exception as e: + logger.error(f"Error processing cursor for document {cursor.spelling}: {e}") - def _register_typedef_symbol(self, node, file_path: str, scope_stack: List[str], content: str): - """Register a typedef symbol.""" - name_node = self._find_child_by_type(node, 'type_identifier') - if name_node: - name = self._get_node_text(name_node, content) - if name: - symbol_id = self.symbol_manager.create_local_symbol( - language="objc", - file_path=file_path, - symbol_path=scope_stack + [name], - descriptor="#" - ) - dummy_range = scip_pb2.Range() - dummy_range.start.extend([0, 0]) - dummy_range.end.extend([0, 1]) - self.reference_resolver.register_symbol_definition( - symbol_id=symbol_id, - file_path=file_path, - definition_range=dummy_range, - symbol_kind=scip_pb2.TypeParameter, - display_name=name, - documentation=["C typedef"] - ) + def _process_cursor_for_imports(self, cursor: 'clang.Cursor', file_path: str, imports: 'ImportGroup') -> None: + """Process a cursor for import/include statements.""" + try: + # Skip 
invalid cursors or those outside our file + if not cursor.location.file: + return - def _register_objc_class_symbol(self, name: str, file_path: str, scope_stack: List[str], description: str): - """Register an Objective-C class/interface symbol.""" - symbol_id = self.symbol_manager.create_local_symbol( - language="objc", - file_path=file_path, - symbol_path=scope_stack + [name], - descriptor="#" - ) - dummy_range = scip_pb2.Range() - dummy_range.start.extend([0, 0]) - dummy_range.end.extend([0, 1]) - self.reference_resolver.register_symbol_definition( - symbol_id=symbol_id, - file_path=file_path, - definition_range=dummy_range, - symbol_kind=scip_pb2.Class, - display_name=name, - documentation=[description] - ) - - def _register_objc_protocol_symbol(self, name: str, file_path: str, scope_stack: List[str]): - """Register an Objective-C protocol symbol.""" - symbol_id = self.symbol_manager.create_local_symbol( - language="objc", - file_path=file_path, - symbol_path=scope_stack + [name], - descriptor="#" - ) - dummy_range = scip_pb2.Range() - dummy_range.start.extend([0, 0]) - dummy_range.end.extend([0, 1]) - self.reference_resolver.register_symbol_definition( - symbol_id=symbol_id, - file_path=file_path, - definition_range=dummy_range, - symbol_kind=scip_pb2.Interface, - display_name=name, - documentation=["Objective-C protocol"] - ) - - def _register_objc_property_symbol(self, name: str, file_path: str, scope_stack: List[str]): - """Register an Objective-C property symbol.""" - symbol_id = self.symbol_manager.create_local_symbol( - language="objc", - file_path=file_path, - symbol_path=scope_stack + [name], - descriptor="" - ) - dummy_range = scip_pb2.Range() - dummy_range.start.extend([0, 0]) - dummy_range.end.extend([0, 1]) - self.reference_resolver.register_symbol_definition( - symbol_id=symbol_id, - file_path=file_path, - definition_range=dummy_range, - symbol_kind=scip_pb2.Property, - display_name=name, - documentation=["Objective-C property"] - ) - - def 
_register_objc_method_symbol(self, name: str, is_class_method: bool, file_path: str, scope_stack: List[str]): - """Register an Objective-C method symbol.""" - symbol_id = self.symbol_manager.create_local_symbol( - language="objc", - file_path=file_path, - symbol_path=scope_stack + [name], - descriptor="()." - ) - dummy_range = scip_pb2.Range() - dummy_range.start.extend([0, 0]) - dummy_range.end.extend([0, 1]) - self.reference_resolver.register_symbol_definition( - symbol_id=symbol_id, - file_path=file_path, - definition_range=dummy_range, - symbol_kind=scip_pb2.Method, - display_name=name, - documentation=[f"Objective-C {'Class' if is_class_method else 'Instance'} method"] - ) - - # Document processing methods (Phase 2) - def _process_c_function_for_document(self, node, file_path: str, scope_stack: List[str], content: str) -> tuple: - """Process C function for SCIP document generation.""" - declarator = self._find_child_by_type(node, 'function_declarator') - if declarator: - name_node = self._find_child_by_type(declarator, 'identifier') - if name_node: - name = self._get_node_text(name_node, content) - if name: - return self._create_function_symbol_for_document(node, name_node, name, file_path, scope_stack, "C function") - return None, None - - def _process_struct_for_document(self, node, file_path: str, scope_stack: List[str], content: str) -> tuple: - """Process struct for SCIP document generation.""" - name_node = self._find_child_by_type(node, 'type_identifier') - if name_node: - name = self._get_node_text(name_node, content) - if name: - return self._create_type_symbol_for_document(node, name_node, name, scip_pb2.Struct, file_path, scope_stack, "C struct") - return None, None - - def _process_enum_for_document(self, node, file_path: str, scope_stack: List[str], content: str) -> tuple: - """Process enum for SCIP document generation.""" - name_node = self._find_child_by_type(node, 'type_identifier') - if name_node: - name = self._get_node_text(name_node, 
content) - if name: - return self._create_type_symbol_for_document(node, name_node, name, scip_pb2.Enum, file_path, scope_stack, "C enum") - return None, None - - def _process_typedef_for_document(self, node, file_path: str, scope_stack: List[str], content: str) -> tuple: - """Process typedef for SCIP document generation.""" - name_node = self._find_child_by_type(node, 'type_identifier') - if name_node: - name = self._get_node_text(name_node, content) - if name: - return self._create_type_symbol_for_document(node, name_node, name, scip_pb2.TypeParameter, file_path, scope_stack, "C typedef") - return None, None - - def _process_identifier_reference_for_document(self, node, file_path: str, scope_stack: List[str], content: str) -> Optional[scip_pb2.Occurrence]: - """Process identifier reference for SCIP document generation.""" - # Only handle if it's not part of a declaration - parent = node.parent - if parent and parent.type not in [ - 'function_definition', 'struct_specifier', 'enum_specifier', 'typedef_declaration' - ]: - name = self._get_node_text(node, content) - if name and len(name) > 1: # Avoid single letters - # Try to resolve the reference - symbol_id = self.reference_resolver.resolve_reference(name, file_path) - if symbol_id: - range_obj = self.position_calculator.tree_sitter_node_to_range(node) - return self._create_occurrence( - symbol_id, range_obj, 0, scip_pb2.Identifier # 0 = reference role - ) - return None - - # Symbol creation helpers for documents - def _create_function_symbol_for_document(self, node, name_node, name: str, file_path: str, scope_stack: List[str], description: str) -> tuple: - """Create a function symbol for SCIP document.""" - symbol_id = self.symbol_manager.create_local_symbol( - language="objc", - file_path=file_path, - symbol_path=scope_stack + [name], - descriptor="()." 
- ) - - # Create definition occurrence - range_obj = self.position_calculator.tree_sitter_node_to_range(name_node) - occurrence = self._create_occurrence( - symbol_id, range_obj, scip_pb2.Definition, scip_pb2.IdentifierFunction - ) - - # Create symbol information - symbol_info = self._create_symbol_information( - symbol_id, name, scip_pb2.Function, [description] - ) - - return occurrence, symbol_info - - def _create_type_symbol_for_document(self, node, name_node, name: str, symbol_kind: int, file_path: str, scope_stack: List[str], description: str) -> tuple: - """Create a type symbol for SCIP document.""" - symbol_id = self.symbol_manager.create_local_symbol( - language="objc", - file_path=file_path, - symbol_path=scope_stack + [name], - descriptor="#" - ) - - # Create definition occurrence - range_obj = self.position_calculator.tree_sitter_node_to_range(name_node) - occurrence = self._create_occurrence( - symbol_id, range_obj, scip_pb2.Definition, scip_pb2.IdentifierType - ) - - # Create symbol information - symbol_info = self._create_symbol_information( - symbol_id, name, symbol_kind, [description] - ) - - return occurrence, symbol_info - - def _create_objc_class_symbol_for_document(self, line_num: int, name: str, file_path: str, scope_stack: List[str], description: str) -> tuple: - """Create an Objective-C class/interface symbol for SCIP document.""" - symbol_id = self.symbol_manager.create_local_symbol( - language="objc", - file_path=file_path, - symbol_path=scope_stack + [name], - descriptor="#" - ) - - # Create definition occurrence from line position - start_col, end_col = self.position_calculator.find_name_in_line(line_num, name) - range_obj = self.position_calculator.line_col_to_range( - line_num, start_col, line_num, end_col - ) - - occurrence = self._create_occurrence( - symbol_id, range_obj, scip_pb2.Definition, scip_pb2.IdentifierType - ) - - # Create symbol information - symbol_info = self._create_symbol_information( - symbol_id, name, scip_pb2.Class, 
[description] - ) - - return occurrence, symbol_info - - def _create_objc_protocol_symbol_for_document(self, line_num: int, name: str, file_path: str, scope_stack: List[str]) -> tuple: - """Create an Objective-C protocol symbol for SCIP document.""" - symbol_id = self.symbol_manager.create_local_symbol( - language="objc", - file_path=file_path, - symbol_path=scope_stack + [name], - descriptor="#" - ) - - # Create definition occurrence from line position - start_col, end_col = self.position_calculator.find_name_in_line(line_num, name) - range_obj = self.position_calculator.line_col_to_range( - line_num, start_col, line_num, end_col - ) - - occurrence = self._create_occurrence( - symbol_id, range_obj, scip_pb2.Definition, scip_pb2.IdentifierType - ) - - # Create symbol information - symbol_info = self._create_symbol_information( - symbol_id, name, scip_pb2.Interface, ["Objective-C protocol"] - ) - - return occurrence, symbol_info - - def _create_objc_property_symbol_for_document(self, line_num: int, name: str, file_path: str, scope_stack: List[str]) -> tuple: - """Create an Objective-C property symbol for SCIP document.""" - symbol_id = self.symbol_manager.create_local_symbol( - language="objc", - file_path=file_path, - symbol_path=scope_stack + [name], - descriptor="" - ) - - # Create definition occurrence from line position - start_col, end_col = self.position_calculator.find_name_in_line(line_num, name) - range_obj = self.position_calculator.line_col_to_range( - line_num, start_col, line_num, end_col - ) - - occurrence = self._create_occurrence( - symbol_id, range_obj, scip_pb2.Definition, scip_pb2.IdentifierLocal - ) + cursor_kind = cursor.kind + + # Process inclusion directives (#import, #include, @import) + if cursor_kind == CursorKind.INCLUSION_DIRECTIVE: + self._process_inclusion_directive(cursor, file_path, imports) + + except Exception as e: + logger.error(f"Error processing cursor for imports: {e}") + + def _process_inclusion_directive(self, cursor: 
'clang.Cursor', file_path: str, imports: 'ImportGroup') -> None: + """Process a single #import/#include/@import directive.""" + try: + # Get the included file + included_file = cursor.get_included_file() + if not included_file: + return + + include_path = str(included_file) + + # Extract framework/module name + framework_name = self._extract_framework_name(include_path, cursor) + if not framework_name: + return + + # Classify the import type + import_type = self._classify_objc_import(include_path) + + # Add to imports + imports.add_import(framework_name, import_type) + + # Register with moniker manager for external dependencies + if import_type in ['standard_library', 'third_party'] and self.symbol_manager: + self._register_framework_dependency(framework_name, import_type, include_path) + + logger.debug(f"Processed import: {framework_name} ({import_type}) from {include_path}") + + except Exception as e: + logger.error(f"Error processing inclusion directive: {e}") + + def _extract_framework_name(self, include_path: str, cursor: 'clang.Cursor') -> Optional[str]: + """Extract framework/module name from include path.""" + try: + # Get the original spelling from the cursor (what was actually written) + spelling = cursor.spelling + if spelling: + # Remove quotes and angle brackets + clean_spelling = spelling.strip('"<>') + + # For framework imports like <Foundation/Foundation.h> + if '/' in clean_spelling: + parts = clean_spelling.split('/') + if len(parts) >= 2: + framework_name = parts[0] + # Common iOS/macOS frameworks + if framework_name in ['Foundation', 'UIKit', 'CoreData', 'CoreGraphics', + 'QuartzCore', 'AVFoundation', 'CoreLocation', 'MapKit']: + return framework_name + # For other frameworks, use the framework name + return framework_name + + # For simple includes like "MyHeader.h" + header_name = clean_spelling.replace('.h', '').replace('.m', '').replace('.mm', '') + return header_name + + # Fallback: extract from full path + if '/' in include_path: + path_parts = 
include_path.split('/') + + # Look for .framework in path + for i, part in enumerate(path_parts): + if part.endswith('.framework') and i + 1 < len(path_parts): + return part.replace('.framework', '') + + # Look for Headers directory (common in frameworks) + if 'Headers' in path_parts: + headers_idx = path_parts.index('Headers') + if headers_idx > 0: + framework_part = path_parts[headers_idx - 1] + if framework_part.endswith('.framework'): + return framework_part.replace('.framework', '') + + # Use the filename without extension + filename = path_parts[-1] + return filename.replace('.h', '').replace('.m', '').replace('.mm', '') + + return None + + except Exception as e: + logger.debug(f"Error extracting framework name from {include_path}: {e}") + return None + + def _classify_objc_import(self, include_path: str) -> str: + """Classify Objective-C import as system, third-party, or local.""" + try: + # System frameworks (typical macOS/iOS system paths) + system_indicators = [ + '/Applications/Xcode.app/', + '/System/Library/', + '/usr/include/', + 'Platforms/iPhoneOS.platform/', + 'Platforms/iPhoneSimulator.platform/', + 'Platforms/MacOSX.platform/' + ] + + for indicator in system_indicators: + if indicator in include_path: + return 'standard_library' + + # Common system frameworks by name + system_frameworks = { + 'Foundation', 'UIKit', 'CoreData', 'CoreGraphics', 'QuartzCore', + 'AVFoundation', 'CoreLocation', 'MapKit', 'CoreAnimation', + 'Security', 'SystemConfiguration', 'CFNetwork', 'CoreFoundation', + 'AppKit', 'Cocoa', 'WebKit', 'JavaScriptCore' + } + + for framework in system_frameworks: + if f'/{framework}.framework/' in include_path or f'{framework}/' in include_path: + return 'standard_library' + + # Third-party dependency managers + third_party_indicators = [ + '/Pods/', # CocoaPods + '/Carthage/', # Carthage + '/node_modules/', # React Native + '/DerivedData/', # Sometimes used for third-party + ] + + for indicator in third_party_indicators: + if indicator 
in include_path: + return 'third_party' + + # Check if it's within the project directory + if hasattr(self, 'project_path') and self.project_path: + if include_path.startswith(str(self.project_path)): + return 'local' + + # Check for relative paths (usually local) + if include_path.startswith('./') or include_path.startswith('../'): + return 'local' + + # If path contains common local indicators + if any(indicator in include_path.lower() for indicator in ['src/', 'source/', 'include/', 'headers/']): + return 'local' + + # Default to third-party for unknown external dependencies + return 'third_party' + + except Exception as e: + logger.debug(f"Error classifying import {include_path}: {e}") + return 'third_party' + + def _register_framework_dependency(self, framework_name: str, import_type: str, include_path: str) -> None: + """Register framework dependency with moniker manager.""" + try: + if not self.symbol_manager: + return + + # Determine package manager based on import type and path + if import_type == 'standard_library': + manager = 'system' + elif '/Pods/' in include_path: + manager = 'cocoapods' + elif '/Carthage/' in include_path: + manager = 'carthage' + else: + manager = 'unknown' + + # Register the external symbol for the framework + self.symbol_manager.create_external_symbol( + language="objc", + package_name=framework_name, + module_path=framework_name, + symbol_name="*", # Framework-level import + version="", # Version detection could be added later + alias=None + ) + + logger.debug(f"Registered framework dependency: {framework_name} via {manager}") + + except Exception as e: + logger.error(f"Error registering framework dependency {framework_name}: {e}") + + def _map_cursor_to_symbol(self, cursor: 'clang.Cursor', symbol_name: str) -> Optional[Tuple[str, int, int]]: + """Map libclang cursor to SCIP symbol information.""" + try: + cursor_kind = cursor.kind + + # Map Objective-C specific cursors + if cursor_kind == CursorKind.OBJC_INTERFACE_DECL: + # 
@interface ClassName + symbol_id = f"local {self._get_local_id_for_cursor(cursor)}" + return (symbol_id, scip_pb2.SymbolKind.Class, scip_pb2.SymbolRole.Definition) + + elif cursor_kind == CursorKind.OBJC_PROTOCOL_DECL: + # @protocol ProtocolName + symbol_id = f"local {self._get_local_id_for_cursor(cursor)}" + return (symbol_id, scip_pb2.SymbolKind.Interface, scip_pb2.SymbolRole.Definition) + + elif cursor_kind == CursorKind.OBJC_CATEGORY_DECL: + # @interface ClassName (CategoryName) + symbol_id = f"local {self._get_local_id_for_cursor(cursor)}" + return (symbol_id, scip_pb2.SymbolKind.Class, scip_pb2.SymbolRole.Definition) + + elif cursor_kind == CursorKind.OBJC_INSTANCE_METHOD_DECL: + # Instance method: - (void)methodName + symbol_id = f"local {self._get_local_id_for_cursor(cursor)}" + return (symbol_id, scip_pb2.SymbolKind.Method, scip_pb2.SymbolRole.Definition) + + elif cursor_kind == CursorKind.OBJC_CLASS_METHOD_DECL: + # Class method: + (void)methodName + symbol_id = f"local {self._get_local_id_for_cursor(cursor)}" + return (symbol_id, scip_pb2.SymbolKind.Method, scip_pb2.SymbolRole.Definition) + + elif cursor_kind == CursorKind.OBJC_PROPERTY_DECL: + # @property declaration + symbol_id = f"local {self._get_local_id_for_cursor(cursor)}" + return (symbol_id, scip_pb2.SymbolKind.Property, scip_pb2.SymbolRole.Definition) + + elif cursor_kind == CursorKind.OBJC_IVAR_DECL: + # Instance variable + symbol_id = f"local {self._get_local_id_for_cursor(cursor)}" + return (symbol_id, scip_pb2.SymbolKind.Field, scip_pb2.SymbolRole.Definition) + + elif cursor_kind == CursorKind.OBJC_IMPLEMENTATION_DECL: + # @implementation ClassName + symbol_id = f"local {self._get_local_id_for_cursor(cursor)}" + return (symbol_id, scip_pb2.SymbolKind.Class, scip_pb2.SymbolRole.Definition) + + elif cursor_kind == CursorKind.OBJC_CATEGORY_IMPL_DECL: + # @implementation ClassName (CategoryName) + symbol_id = f"local {self._get_local_id_for_cursor(cursor)}" + return (symbol_id, 
scip_pb2.SymbolKind.Class, scip_pb2.SymbolRole.Definition) + + elif cursor_kind == CursorKind.FUNCTION_DECL: + # Regular C function + symbol_id = f"local {self._get_local_id_for_cursor(cursor)}" + return (symbol_id, scip_pb2.SymbolKind.Function, scip_pb2.SymbolRole.Definition) + + elif cursor_kind == CursorKind.VAR_DECL: + # Variable declaration + symbol_id = f"local {self._get_local_id_for_cursor(cursor)}" + return (symbol_id, scip_pb2.SymbolKind.Variable, scip_pb2.SymbolRole.Definition) + + elif cursor_kind == CursorKind.TYPEDEF_DECL: + # Type definition + symbol_id = f"local {self._get_local_id_for_cursor(cursor)}" + return (symbol_id, scip_pb2.SymbolKind.TypeParameter, scip_pb2.SymbolRole.Definition) + + # Add more cursor mappings as needed + return None + + except Exception as e: + logger.error(f"Error mapping cursor {symbol_name}: {e}") + return None + + def _get_local_id(self) -> str: + """Generate unique local symbol ID.""" + self._symbol_counter += 1 + return f"objc_{self._symbol_counter}" + + def _get_local_id_for_cursor(self, cursor: 'clang.Cursor') -> str: + """Generate consistent local symbol ID based on cursor properties.""" + # Create deterministic ID based on cursor type, name, and location + cursor_type = cursor.kind.name.lower() + symbol_name = cursor.spelling or "unnamed" + line = cursor.location.line - # Create symbol information - symbol_info = self._create_symbol_information( - symbol_id, name, scip_pb2.Property, ["Objective-C property"] - ) + return f"{cursor_type}_{symbol_name}_{line}" + + def _create_occurrence(self, symbol_id: str, start_line: int, start_col: int, + end_line: int, end_col: int, symbol_roles: int) -> Optional[scip_pb2.Occurrence]: + """Create SCIP occurrence.""" + try: + occurrence = scip_pb2.Occurrence() + occurrence.symbol = symbol_id + occurrence.symbol_roles = symbol_roles + occurrence.range.start.extend([start_line, start_col]) + occurrence.range.end.extend([end_line, end_col]) + + return occurrence + + except 
Exception as e: + logger.error(f"Error creating occurrence: {e}") + return None + + def _create_symbol_information(self, symbol_id: str, display_name: str, symbol_kind: int) -> Optional[scip_pb2.SymbolInformation]: + """Create SCIP symbol information.""" + try: + symbol_info = scip_pb2.SymbolInformation() + symbol_info.symbol = symbol_id + symbol_info.kind = symbol_kind + symbol_info.display_name = display_name + + return symbol_info + + except Exception as e: + logger.error(f"Error creating symbol information: {e}") + return None + + def _create_symbol_information_with_relationships(self, symbol_id: str, display_name: str, symbol_kind: int, relationships: List['scip_pb2.Relationship']) -> Optional[scip_pb2.SymbolInformation]: + """Create SCIP symbol information with relationships.""" + try: + symbol_info = scip_pb2.SymbolInformation() + symbol_info.symbol = symbol_id + symbol_info.kind = symbol_kind + symbol_info.display_name = display_name + + # Add relationships if provided + if relationships: + symbol_info.relationships.extend(relationships) + + return symbol_info + + except Exception as e: + logger.error(f"Error creating symbol information with relationships: {e}") + return None + + def _extract_relationships_from_file(self, file_path: str, project_path: str) -> Dict[str, List[tuple]]: + """Extract relationships from a single Objective-C file using libclang.""" + content = self._read_file_content(file_path) + if not content: + return {} - return occurrence, symbol_info - - def _create_objc_method_symbol_for_document(self, line_num: int, name: str, is_class_method: bool, file_path: str, scope_stack: List[str]) -> tuple: - """Create an Objective-C method symbol for SCIP document.""" - symbol_id = self.symbol_manager.create_local_symbol( - language="objc", - file_path=file_path, - symbol_path=scope_stack + [name], - descriptor="()." 
- ) + try: + # Parse with libclang + index = clang.Index.create() + translation_unit = index.parse( + file_path, + args=['-ObjC', '-x', 'objective-c'], + options=clang.TranslationUnit.PARSE_DETAILED_PROCESSING_RECORD + ) + + if not translation_unit: + return {} + + return self._extract_relationships_from_ast(translation_unit.cursor, file_path, project_path) + + except Exception as e: + logger.error(f"Error extracting relationships from {file_path}: {e}") + return {} + + def _extract_relationships_from_ast(self, cursor: 'clang.Cursor', file_path: str, project_path: str) -> Dict[str, List[tuple]]: + """Extract relationships from libclang AST.""" + relationships = {} + relative_path = self._get_relative_path(file_path, project_path) - # Create definition occurrence from line position - start_col, end_col = self.position_calculator.find_name_in_line(line_num, name) - range_obj = self.position_calculator.line_col_to_range( - line_num, start_col, line_num, end_col - ) + # Track current method context for method calls + current_method_symbol = None - occurrence = self._create_occurrence( - symbol_id, range_obj, scip_pb2.Definition, scip_pb2.IdentifierFunction - ) + def traverse_for_relationships(cursor_node, parent_method=None): + """Recursively traverse AST to find relationships.""" + nonlocal current_method_symbol + + try: + # Skip if cursor is not in our file + if not cursor_node.location.file or cursor_node.spelling == "": + pass + else: + cursor_file = str(cursor_node.location.file) + if cursor_file.endswith(os.path.basename(file_path)): + cursor_kind = cursor_node.kind + + # Track method context + if cursor_kind in (CursorKind.OBJC_INSTANCE_METHOD_DECL, CursorKind.OBJC_CLASS_METHOD_DECL): + method_symbol_id = f"local {self._get_local_id_for_cursor(cursor_node)}" + current_method_symbol = method_symbol_id + parent_method = method_symbol_id + + # Detect Objective-C method calls + elif cursor_kind == CursorKind.OBJC_MESSAGE_EXPR: + if parent_method: + # Get the method 
being called + called_method = self._extract_method_from_message_expr(cursor_node) + if called_method: + target_symbol_id = f"local objc_call_{called_method}_{cursor_node.location.line}" + + if parent_method not in relationships: + relationships[parent_method] = [] + relationships[parent_method].append((target_symbol_id, InternalRelationshipType.CALLS)) + + logger.debug(f"Found method call: {parent_method} -> {target_symbol_id}") + + # Detect C function calls + elif cursor_kind == CursorKind.CALL_EXPR: + if parent_method: + function_name = cursor_node.spelling + if function_name: + target_symbol_id = f"local c_func_{function_name}_{cursor_node.location.line}" + + if parent_method not in relationships: + relationships[parent_method] = [] + relationships[parent_method].append((target_symbol_id, InternalRelationshipType.CALLS)) + + logger.debug(f"Found function call: {parent_method} -> {target_symbol_id}") + + # Recursively process children + for child in cursor_node.get_children(): + traverse_for_relationships(child, parent_method) + + except Exception as e: + logger.error(f"Error processing cursor for relationships: {e}") - # Create symbol information - method_type = "Class method" if is_class_method else "Instance method" - symbol_info = self._create_symbol_information( - symbol_id, name, scip_pb2.Method, [f"Objective-C {method_type.lower()}"] - ) + # Start traversal + traverse_for_relationships(cursor) - return occurrence, symbol_info - - # Utility methods - def _find_child_by_type(self, node, node_type: str) -> Optional: - """Find first child node of the given type.""" - for child in node.children: - if child.type == node_type: - return child - return None - - def _get_node_text(self, node, content: str) -> str: - """Get text content of a node.""" - return content[node.start_byte:node.end_byte] - - def _create_occurrence(self, symbol_id: str, range_obj: scip_pb2.Range, - symbol_roles: int, syntax_kind: int) -> scip_pb2.Occurrence: - """Create a SCIP 
occurrence.""" - occurrence = scip_pb2.Occurrence() - occurrence.symbol = symbol_id - occurrence.symbol_roles = symbol_roles - occurrence.syntax_kind = syntax_kind - occurrence.range.CopyFrom(range_obj) - return occurrence - - def _create_symbol_information(self, symbol_id: str, display_name: str, - symbol_kind: int, documentation: List[str] = None) -> scip_pb2.SymbolInformation: - """Create SCIP symbol information.""" - symbol_info = scip_pb2.SymbolInformation() - symbol_info.symbol = symbol_id - symbol_info.display_name = display_name - symbol_info.kind = symbol_kind + return relationships + + def _extract_method_from_message_expr(self, cursor: 'clang.Cursor') -> Optional[str]: + """Extract method name from Objective-C message expression.""" + try: + # Get the selector/method name from the message expression + # This is a simplified extraction - could be enhanced + for child in cursor.get_children(): + if child.kind == CursorKind.OBJC_MESSAGE_EXPR: + return child.spelling + elif child.spelling and len(child.spelling) > 0: + # Try to get method name from any meaningful child + return child.spelling + + # Fallback: use the cursor's own spelling if available + return cursor.spelling if cursor.spelling else None + + except Exception as e: + logger.error(f"Error extracting method from message expression: {e}") + return None + + def _create_scip_relationships(self, relationships: List[tuple]) -> List['scip_pb2.Relationship']: + """Convert internal relationships to SCIP relationships.""" + scip_relationships = [] - if documentation: - symbol_info.documentation.extend(documentation) + for target_symbol, relationship_type in relationships: + try: + relationship = scip_pb2.Relationship() + relationship.symbol = target_symbol + + # Map relationship type to SCIP flags + if relationship_type == InternalRelationshipType.CALLS: + relationship.is_reference = True + elif relationship_type == InternalRelationshipType.INHERITS: + relationship.is_reference = True + elif 
relationship_type == InternalRelationshipType.IMPLEMENTS: + relationship.is_implementation = True + else: + relationship.is_reference = True # Default fallback + + scip_relationships.append(relationship) + + except Exception as e: + logger.error(f"Error creating SCIP relationship: {e}") + continue - return symbol_info \ No newline at end of file + return scip_relationships + + def _get_document_language(self, file_path: str) -> str: + """Get the document language identifier.""" + if file_path.endswith('.mm'): + return 'objcpp' + return 'objc' + + # Utility methods from base strategy + def _read_file_content(self, file_path: str) -> Optional[str]: + """Read file content safely.""" + try: + with open(file_path, 'r', encoding='utf-8', errors='ignore') as f: + return f.read() + except Exception as e: + logger.warning(f"Failed to read file {file_path}: {e}") + return None + + def _get_relative_path(self, file_path: str, project_path: str) -> str: + """Get relative path from project root.""" + return os.path.relpath(file_path, project_path).replace(os.sep, '/') + + def get_supported_languages(self) -> List[str]: + """Return list of supported language identifiers.""" + return ["objective-c", "objc", "objective-c-header"] + + +class StrategyError(Exception): + """Exception raised when a strategy cannot process files.""" + pass \ No newline at end of file diff --git a/src/code_index_mcp/tools/scip/scip_symbol_analyzer.py b/src/code_index_mcp/tools/scip/scip_symbol_analyzer.py index fa070ab..5bd4e31 100644 --- a/src/code_index_mcp/tools/scip/scip_symbol_analyzer.py +++ b/src/code_index_mcp/tools/scip/scip_symbol_analyzer.py @@ -121,7 +121,7 @@ def analyze_file(self, file_path: str, scip_index) -> FileAnalysis: logger.debug("Completed call relationship extraction") # Step 4: Organize results into final structure - result = self._organize_results(document, symbols) + result = self._organize_results(document, symbols, scip_index) logger.debug(f"Analysis complete: 
{len(result.functions)} functions, {len(result.classes)} classes") return result @@ -558,13 +558,14 @@ def _extract_call_relationships(self, document, symbols: Dict[str, SymbolDefinit logger.debug(f"Relationship extraction completed for {len(symbols)} symbols") - def _organize_results(self, document, symbols: Dict[str, SymbolDefinition]) -> FileAnalysis: + def _organize_results(self, document, symbols: Dict[str, SymbolDefinition], scip_index=None) -> FileAnalysis: """ Organize extracted symbols into final FileAnalysis structure. Args: document: SCIP document symbols: Extracted symbol definitions + scip_index: Full SCIP index for external symbol extraction Returns: FileAnalysis with organized results @@ -581,9 +582,12 @@ def _organize_results(self, document, symbols: Dict[str, SymbolDefinition]) -> F for symbol in symbols.values(): result.add_symbol(symbol) - # Extract import information + # Extract import information from occurrences self._extract_imports(document, result.imports) + # Also extract imports from external symbols (for strategies like Objective-C) + if scip_index: + self._extract_imports_from_external_symbols(scip_index, result.imports) return result @@ -842,6 +846,7 @@ def _extract_imports(self, document, imports: ImportGroup): try: seen_modules = set() + # Method 1: Extract from occurrences with Import role (traditional approach) for occurrence in document.occurrences: # Only process Import role symbols if not self._is_import_occurrence(occurrence): @@ -872,10 +877,94 @@ def _extract_imports(self, document, imports: ImportGroup): imports.add_import(module_path, 'local') seen_modules.add(module_path) - logger.debug(f"Extracted {len(seen_modules)} unique imports from SCIP data") + logger.debug(f"Extracted {len(seen_modules)} unique imports from SCIP occurrences") except Exception as e: - logger.debug(f"Error extracting imports: {e}") + logger.debug(f"Error extracting imports from occurrences: {e}") + + def _extract_imports_from_external_symbols(self, 
scip_index, imports: ImportGroup): + """Extract imports from SCIP index external symbols (for strategies like Objective-C).""" + try: + if not hasattr(scip_index, 'external_symbols'): + logger.debug("No external_symbols in SCIP index") + return + + seen_modules = set() + + for symbol_info in scip_index.external_symbols: + if not symbol_info.symbol: + continue + + # Parse the external symbol + parsed_symbol = self._symbol_parser.parse_symbol(symbol_info.symbol) if self._symbol_parser else None + if not parsed_symbol: + # Fallback: try to extract framework name from symbol string + framework_name = self._extract_framework_from_symbol_string(symbol_info.symbol) + if framework_name and framework_name not in seen_modules: + # Classify based on symbol pattern + import_type = self._classify_external_symbol(symbol_info.symbol) + imports.add_import(framework_name, import_type) + seen_modules.add(framework_name) + logger.debug(f"Extracted external dependency: {framework_name} ({import_type})") + continue + + # Handle based on manager type + if parsed_symbol.manager in ['system', 'unknown']: + # For Objective-C system frameworks + package_name = parsed_symbol.package + if package_name and package_name not in seen_modules: + imports.add_import(package_name, 'standard_library') + seen_modules.add(package_name) + + elif parsed_symbol.manager in ['cocoapods', 'carthage']: + # Third-party Objective-C dependencies + package_name = parsed_symbol.package + if package_name and package_name not in seen_modules: + imports.add_import(package_name, 'third_party') + seen_modules.add(package_name) + + logger.debug(f"Extracted {len(seen_modules)} unique imports from external symbols") + + except Exception as e: + logger.debug(f"Error extracting imports from external symbols: {e}") + + def _extract_framework_from_symbol_string(self, symbol_string: str) -> Optional[str]: + """Extract framework name from SCIP symbol string.""" + try: + # Handle symbols like "scip-unknown unknown Foundation 
Foundation *." + parts = symbol_string.split() + if len(parts) >= 4: + # The package name is typically the 3rd or 4th part + for part in parts[2:5]: # Check parts 2, 3, 4 + if part and part != 'unknown' and not part.endswith('.'): + return part + return None + except Exception: + return None + + def _classify_external_symbol(self, symbol_string: str) -> str: + """Classify external symbol as standard_library, third_party, or local.""" + try: + # Check for known system frameworks + system_frameworks = { + 'Foundation', 'UIKit', 'CoreData', 'CoreGraphics', 'QuartzCore', + 'AVFoundation', 'CoreLocation', 'MapKit', 'CoreAnimation', + 'Security', 'SystemConfiguration', 'CFNetwork', 'CoreFoundation', + 'AppKit', 'Cocoa', 'WebKit', 'JavaScriptCore' + } + + for framework in system_frameworks: + if framework in symbol_string: + return 'standard_library' + + # Check for third-party indicators + if any(indicator in symbol_string.lower() for indicator in ['cocoapods', 'carthage', 'pods']): + return 'third_party' + + return 'standard_library' # Default for external symbols + + except Exception: + return 'standard_library' def _parse_external_module(self, external_symbol: str) -> Optional[Dict[str, str]]: """Parse external SCIP symbol to extract module information.""" diff --git a/src/scripts/inspect_doc_symbols.py b/src/scripts/inspect_doc_symbols.py deleted file mode 100644 index 94ca1b2..0000000 --- a/src/scripts/inspect_doc_symbols.py +++ /dev/null @@ -1,68 +0,0 @@ -import os -import sys -import argparse - -CURRENT_DIR = os.path.dirname(__file__) -SRC_DIR = os.path.abspath(os.path.join(CURRENT_DIR, '..')) -if SRC_DIR not in sys.path: - sys.path.insert(0, SRC_DIR) - -from code_index_mcp.scip.proto import scip_pb2 - - -def normalize(p: str) -> str: - return p.replace('\\', '/') - - -def load_index(path: str) -> scip_pb2.Index: - with open(path, 'rb') as f: - data = f.read() - idx = scip_pb2.Index() - idx.ParseFromString(data) - return idx - - -def main(): - ap = 
argparse.ArgumentParser() - ap.add_argument('--path', required=True, help='Path to index.scip') - ap.add_argument('--file', required=True, help='Relative path to match (forward slashes)') - args = ap.parse_args() - - idx = load_index(args.path) - target = normalize(args.file) - - print(f'Total docs: {len(idx.documents)}') - - doc = None - for d in idx.documents: - if normalize(d.relative_path) == target: - doc = d - break - if doc is None: - # try case-insensitive - tl = target.lower() - for d in idx.documents: - if normalize(d.relative_path).lower() == tl: - doc = d - print('(case-insensitive hit)') - break - - if doc is None: - print('Document not found') - sys.exit(2) - - print(f'Document: {doc.relative_path} language={doc.language}') - print(f'Occurrences: {len(doc.occurrences)}') - print(f'Symbols: {len(doc.symbols)}') - - for i, s in enumerate(doc.symbols[:200]): - try: - kind_name = scip_pb2.SymbolInformation.Kind.Name(s.kind) - except Exception: - kind_name = str(s.kind) - dn = getattr(s, 'display_name', '') - print(f' [{i}] name={dn!r} kind={s.kind} ({kind_name})') - - -if __name__ == '__main__': - main() diff --git a/temp_js_end.py b/temp_js_end.py deleted file mode 100644 index a72b0cb..0000000 --- a/temp_js_end.py +++ /dev/null @@ -1 +0,0 @@ - return symbol_info \ No newline at end of file diff --git a/temp_js_strategy.py b/temp_js_strategy.py deleted file mode 100644 index 24fcf0b..0000000 --- a/temp_js_strategy.py +++ /dev/null @@ -1,535 +0,0 @@ -"""JavaScript/TypeScript SCIP indexing strategy - SCIP standard compliant.""" - -import logging -import os -from typing import List, Optional, Dict, Any, Set - -from .base_strategy import SCIPIndexerStrategy, StrategyError -from ..proto import scip_pb2 -from ..core.position_calculator import PositionCalculator -from ..core.relationship_types import InternalRelationshipType - -# Tree-sitter imports -try: - import tree_sitter - from tree_sitter_javascript import language as js_language - from 
tree_sitter_typescript import language as ts_language - TREE_SITTER_AVAILABLE = True -except ImportError: - TREE_SITTER_AVAILABLE = False - tree_sitter = None - js_language = None - ts_language = None - -logger = logging.getLogger(__name__) - - -class JavaScriptStrategy(SCIPIndexerStrategy): - """SCIP-compliant JavaScript/TypeScript indexing strategy using Tree-sitter.""" - - SUPPORTED_EXTENSIONS = {'.js', '.jsx', '.ts', '.tsx', '.mjs', '.cjs'} - - def __init__(self, priority: int = 95): - """Initialize the JavaScript/TypeScript strategy.""" - super().__init__(priority) - - # Initialize parsers if Tree-sitter is available - if TREE_SITTER_AVAILABLE: - try: - js_lang = tree_sitter.Language(js_language()) - ts_lang = tree_sitter.Language(ts_language()) - - self.js_parser = tree_sitter.Parser(js_lang) - self.ts_parser = tree_sitter.Parser(ts_lang) - logger.info("JavaScript strategy initialized with Tree-sitter support") - except Exception as e: - logger.warning(f"Failed to initialize Tree-sitter parsers: {e}") - self.js_parser = None - self.ts_parser = None - else: - self.js_parser = None - self.ts_parser = None - raise StrategyError("Tree-sitter not available for JavaScript/TypeScript strategy") - - def can_handle(self, extension: str, file_path: str) -> bool: - """Check if this strategy can handle the file type.""" - return extension.lower() in self.SUPPORTED_EXTENSIONS - - def get_language_name(self) -> str: - """Get the language name for SCIP symbol generation.""" - return "javascript" # Use 'javascript' for both JS and TS - - def is_available(self) -> bool: - """Check if this strategy is available.""" - if not TREE_SITTER_AVAILABLE or not self.js_parser or not self.ts_parser: - raise StrategyError("Tree-sitter not available for JavaScript/TypeScript strategy") - return True - - def _collect_symbol_definitions(self, files: List[str], project_path: str) -> None: - """Phase 1: Collect all symbol definitions from JavaScript/TypeScript files.""" - 
logger.debug(f"JavaScriptStrategy Phase 1: Processing {len(files)} files for symbol collection") - processed_count = 0 - error_count = 0 - - for i, file_path in enumerate(files, 1): - relative_path = os.path.relpath(file_path, project_path) - - try: - self._collect_symbols_from_file(file_path, project_path) - processed_count += 1 - - if i % 10 == 0 or i == len(files): - logger.debug(f"Phase 1 progress: {i}/{len(files)} files, last file: {relative_path}") - - except Exception as e: - error_count += 1 - logger.warning(f"Phase 1 failed for {relative_path}: {e}") - continue - - logger.info(f"Phase 1 summary: {processed_count} files processed, {error_count} errors") - - def _generate_documents_with_references(self, files: List[str], project_path: str, relationships: Optional[Dict[str, List[tuple]]] = None) -> List[scip_pb2.Document]: - """Phase 2: Generate complete SCIP documents with resolved references.""" - documents = [] - logger.debug(f"JavaScriptStrategy Phase 2: Generating documents for {len(files)} files") - processed_count = 0 - error_count = 0 - total_occurrences = 0 - total_symbols = 0 - - for i, file_path in enumerate(files, 1): - relative_path = os.path.relpath(file_path, project_path) - - try: - document = self._analyze_js_file(file_path, project_path, relationships) - if document: - documents.append(document) - total_occurrences += len(document.occurrences) - total_symbols += len(document.symbols) - processed_count += 1 - - if i % 10 == 0 or i == len(files): - logger.debug(f"Phase 2 progress: {i}/{len(files)} files, " - f"last file: {relative_path}, " - f"{len(document.occurrences) if document else 0} occurrences") - - except Exception as e: - error_count += 1 - logger.error(f"Phase 2 failed for {relative_path}: {e}") - continue - - logger.info(f"Phase 2 summary: {processed_count} documents generated, {error_count} errors, " - f"{total_occurrences} total occurrences, {total_symbols} total symbols") - - return documents - - def 
_build_symbol_relationships(self, files: List[str], project_path: str) -> Dict[str, List[tuple]]: - """ - Build relationships between JavaScript/TypeScript symbols. - - Args: - files: List of file paths to process - project_path: Project root path - - Returns: - Dictionary mapping symbol_id -> [(target_symbol_id, relationship_type), ...] - """ - logger.debug(f"JavaScriptStrategy: Building symbol relationships for {len(files)} files") - - all_relationships = {} - - for file_path in files: - try: - file_relationships = self._extract_js_relationships_from_file(file_path, project_path) - all_relationships.update(file_relationships) - except Exception as e: - logger.warning(f"Failed to extract relationships from {file_path}: {e}") - - total_symbols_with_relationships = len(all_relationships) - total_relationships = sum(len(rels) for rels in all_relationships.values()) - - logger.debug(f"JavaScriptStrategy: Built {total_relationships} relationships for {total_symbols_with_relationships} symbols") - return all_relationships - - def _collect_symbols_from_file(self, file_path: str, project_path: str) -> None: - """Collect symbol definitions from a single JavaScript/TypeScript file.""" - - # Read file content - content = self._read_file_content(file_path) - if not content: - logger.debug(f"Empty file skipped: {os.path.relpath(file_path, project_path)}") - return - - # Parse with Tree-sitter - try: - tree = self._parse_js_content(content, file_path) - if not tree or not tree.root_node: - raise StrategyError(f"Failed to parse {os.path.relpath(file_path, project_path)}") - except Exception as e: - raise StrategyError(f"Parse error in {os.path.relpath(file_path, project_path)}: {e}") - - # Collect symbols using Tree-sitter - relative_path = self._get_relative_path(file_path, project_path) - self._collect_symbols_from_tree(tree, relative_path, content) - logger.debug(f"Symbol collection - {relative_path}") - - def _analyze_js_file(self, file_path: str, project_path: str, 
relationships: Optional[Dict[str, List[tuple]]] = None) -> Optional[scip_pb2.Document]: - """Analyze a single JavaScript/TypeScript file and generate complete SCIP document.""" - relative_path = self._get_relative_path(file_path, project_path) - - # Read file content - content = self._read_file_content(file_path) - if not content: - logger.debug(f"Empty file skipped: {relative_path}") - return None - - # Parse with Tree-sitter - try: - tree = self._parse_js_content(content, file_path) - if not tree or not tree.root_node: - raise StrategyError(f"Failed to parse {relative_path}") - except Exception as e: - raise StrategyError(f"Parse error in {relative_path}: {e}") - - # Create SCIP document - document = scip_pb2.Document() - document.relative_path = relative_path - document.language = self.get_language_name() - - # Analyze tree and generate occurrences - self.position_calculator = PositionCalculator(content) - occurrences, symbols = self._analyze_tree_for_document(tree, relative_path, content, relationships) - - # Add results to document - document.occurrences.extend(occurrences) - document.symbols.extend(symbols) - - logger.debug(f"Document analysis - {relative_path}: " - f"-> {len(document.occurrences)} occurrences, {len(document.symbols)} symbols") - - return document - - def _extract_js_relationships_from_file(self, file_path: str, project_path: str) -> Dict[str, List[tuple]]: - """ - Extract relationships from a single JavaScript/TypeScript file. - - Args: - file_path: File to analyze - project_path: Project root path - - Returns: - Dictionary mapping symbol_id -> [(target_symbol_id, relationship_type), ...] 
-        """
-        content = self._read_file_content(file_path)
-        if not content:
-            return {}
-
-        try:
-            tree = self._parse_js_content(content, file_path)
-            if not tree or not tree.root_node:
-                raise StrategyError(f"Failed to parse {file_path} for relationship extraction")
-        except Exception as e:
-            raise StrategyError(f"Parse error in {file_path}: {e}")
-
-        return self._extract_relationships_from_tree(tree, file_path, project_path)
-
-    def _parse_js_content(self, content: str, file_path: str):
-        """Parse JavaScript/TypeScript content using Tree-sitter parser."""
-        if not TREE_SITTER_AVAILABLE or not self.js_parser or not self.ts_parser:
-            raise StrategyError("Tree-sitter not available for JavaScript/TypeScript parsing")
-
-        # Determine parser based on file extension
-        extension = os.path.splitext(file_path)[1].lower()
-
-        if extension in {'.ts', '.tsx'}:
-            parser = self.ts_parser
-        else:
-            parser = self.js_parser
-
-        try:
-            content_bytes = content.encode('utf-8')
-            return parser.parse(content_bytes)
-        except Exception as e:
-            raise StrategyError(f"Failed to parse {file_path} with Tree-sitter: {e}")
-
-
-    def _collect_symbols_from_tree(self, tree, file_path: str, content: str) -> None:
-        """Collect symbols from Tree-sitter tree."""
-
-        def visit_node(node, scope_stack=[]):
-            node_type = node.type
-
-            if node_type in ['function_declaration', 'method_definition', 'arrow_function']:
-                self._register_js_function(node, file_path, scope_stack)
-            elif node_type in ['class_declaration']:
-                self._register_js_class(node, file_path, scope_stack)
-
-            # Recursively visit children
-            for child in node.children:
-                visit_node(child, scope_stack)
-
-        visit_node(tree.root_node)
-
-    def _analyze_tree_for_document(self, tree, file_path: str, content: str, relationships: Optional[Dict[str, List[tuple]]] = None) -> tuple:
-        """Analyze Tree-sitter tree to generate occurrences and symbols for SCIP document."""
-        occurrences = []
-        symbols = []
-
-        def visit_node(node, scope_stack=[]):
-            node_type =
node.type
-
-            if node_type in ['function_declaration', 'method_definition', 'arrow_function']:
-                name = self._get_js_function_name(node)
-                if name:
-                    symbol_id = self._create_js_function_symbol_id(name, file_path, scope_stack)
-                    occurrence = self._create_js_function_occurrence(node, symbol_id)
-                    symbol_relationships = relationships.get(symbol_id, []) if relationships else []
-                    scip_relationships = self._create_scip_relationships(symbol_relationships) if symbol_relationships else []
-                    symbol_info = self._create_js_function_symbol_info(node, symbol_id, name, scip_relationships)
-
-                    if occurrence:
-                        occurrences.append(occurrence)
-                    if symbol_info:
-                        symbols.append(symbol_info)
-
-            elif node_type in ['class_declaration']:
-                name = self._get_js_class_name(node)
-                if name:
-                    symbol_id = self._create_js_class_symbol_id(name, file_path, scope_stack)
-                    occurrence = self._create_js_class_occurrence(node, symbol_id)
-                    symbol_relationships = relationships.get(symbol_id, []) if relationships else []
-                    scip_relationships = self._create_scip_relationships(symbol_relationships) if symbol_relationships else []
-                    symbol_info = self._create_js_class_symbol_info(node, symbol_id, name, scip_relationships)
-
-                    if occurrence:
-                        occurrences.append(occurrence)
-                    if symbol_info:
-                        symbols.append(symbol_info)
-
-            # Recursively visit children
-            for child in node.children:
-                visit_node(child, scope_stack)
-
-        visit_node(tree.root_node)
-        return occurrences, symbols
-
-    def _extract_relationships_from_tree(self, tree, file_path: str, project_path: str) -> Dict[str, List[tuple]]:
-        """Extract relationships from Tree-sitter tree."""
-        relationships = {}
-
-        def visit_node(node, scope_stack=[]):
-            node_type = node.type
-
-            if node_type == 'class_declaration':
-                # Extract inheritance relationships for ES6 classes
-                class_name = self._get_js_class_name(node)
-                if class_name:
-                    class_symbol_id = self._create_js_class_symbol_id(class_name, file_path, scope_stack)
-
-                    # Look for extends clause
-                    for
child in node.children:
-                        if child.type == 'class_heritage':
-                            for heritage_child in child.children:
-                                if heritage_child.type == 'identifier':
-                                    parent_name = self._get_node_text(heritage_child)
-                                    if parent_name:
-                                        parent_symbol_id = self._create_js_class_symbol_id(parent_name, file_path, scope_stack)
-                                        if class_symbol_id not in relationships:
-                                            relationships[class_symbol_id] = []
-                                        relationships[class_symbol_id].append((parent_symbol_id, InternalRelationshipType.INHERITS))
-
-            elif node_type in ['function_declaration', 'method_definition', 'arrow_function']:
-                # Extract function call relationships
-                function_name = self._get_js_function_name(node)
-                if function_name:
-                    function_symbol_id = self._create_js_function_symbol_id(function_name, file_path, scope_stack)
-
-                    # Find call expressions within this function
-                    self._extract_calls_from_node(node, function_symbol_id, relationships, file_path, scope_stack)
-
-            # Recursively visit children
-            for child in node.children:
-                visit_node(child, scope_stack)
-
-        visit_node(tree.root_node)
-        return relationships
-
-    def _extract_calls_from_node(self, node, source_symbol_id: str, relationships: Dict, file_path: str, scope_stack: List):
-        """Extract function calls from a node."""
-
-        def visit_for_calls(n):
-            if n.type == 'call_expression':
-                # Get the function being called
-                function_node = n.children[0] if n.children else None
-                if function_node:
-                    if function_node.type == 'identifier':
-                        target_name = self._get_node_text(function_node)
-                        if target_name:
-                            target_symbol_id = self._create_js_function_symbol_id(target_name, file_path, scope_stack)
-                            if source_symbol_id not in relationships:
-                                relationships[source_symbol_id] = []
-                            relationships[source_symbol_id].append((target_symbol_id, InternalRelationshipType.CALLS))
-
-            for child in n.children:
-                visit_for_calls(child)
-
-        visit_for_calls(node)
-
-    # Helper methods for Tree-sitter node processing
-    def _get_node_text(self, node) -> Optional[str]:
-        """Get text content of
a Tree-sitter node."""
-        if hasattr(node, 'text'):
-            try:
-                return node.text.decode('utf-8')
-            except:
-                pass
-        return None
-
-    def _get_js_function_name(self, node) -> Optional[str]:
-        """Extract function name from function node."""
-        for child in node.children:
-            if child.type == 'identifier':
-                return self._get_node_text(child)
-        return None
-
-    def _get_js_class_name(self, node) -> Optional[str]:
-        """Extract class name from class node."""
-        for child in node.children:
-            if child.type == 'identifier':
-                return self._get_node_text(child)
-        return None
-
-    # Symbol registration and creation methods
-    def _register_js_function(self, node, file_path: str, scope_stack: List[str]) -> None:
-        """Register a JavaScript function symbol definition."""
-        name = self._get_js_function_name(node)
-        if not name:
-            return
-
-        symbol_id = self.symbol_manager.create_local_symbol(
-            language="javascript",
-            file_path=file_path,
-            symbol_path=scope_stack + [name],
-            descriptor="()."
-        )
-
-        # Create a dummy range for registration
-        dummy_range = scip_pb2.Range()
-        dummy_range.start.extend([0, 0])
-        dummy_range.end.extend([0, 1])
-
-        self.reference_resolver.register_symbol_definition(
-            symbol_id=symbol_id,
-            file_path=file_path,
-            definition_range=dummy_range,
-            symbol_kind=scip_pb2.Function,
-            display_name=name,
-            documentation=["JavaScript function"]
-        )
-
-    def _register_js_class(self, node, file_path: str, scope_stack: List[str]) -> None:
-        """Register a JavaScript class symbol definition."""
-        name = self._get_js_class_name(node)
-        if not name:
-            return
-
-        symbol_id = self.symbol_manager.create_local_symbol(
-            language="javascript",
-            file_path=file_path,
-            symbol_path=scope_stack + [name],
-            descriptor="#"
-        )
-
-        # Create a dummy range for registration
-        dummy_range = scip_pb2.Range()
-        dummy_range.start.extend([0, 0])
-        dummy_range.end.extend([0, 1])
-
-        self.reference_resolver.register_symbol_definition(
-            symbol_id=symbol_id,
-            file_path=file_path,
-            definition_range=dummy_range,
-            symbol_kind=scip_pb2.Class,
-            display_name=name,
-            documentation=["JavaScript class"]
-        )
-
-    def _create_js_function_symbol_id(self, name: str, file_path: str, scope_stack: List[str]) -> str:
-        """Create symbol ID for JavaScript function."""
-        return self.symbol_manager.create_local_symbol(
-            language="javascript",
-            file_path=file_path,
-            symbol_path=scope_stack + [name],
-            descriptor="()."
-        )
-
-    def _create_js_class_symbol_id(self, name: str, file_path: str, scope_stack: List[str]) -> str:
-        """Create symbol ID for JavaScript class."""
-        return self.symbol_manager.create_local_symbol(
-            language="javascript",
-            file_path=file_path,
-            symbol_path=scope_stack + [name],
-            descriptor="#"
-        )
-
-    def _create_js_function_occurrence(self, node, symbol_id: str) -> Optional[scip_pb2.Occurrence]:
-        """Create SCIP occurrence for JavaScript function."""
-        if not self.position_calculator:
-            return None
-
-        try:
-            # Convert Tree-sitter node to range (simplified)
-            range_obj = scip_pb2.Range()
-            range_obj.start.extend([node.start_point[0], node.start_point[1]])
-            range_obj.end.extend([node.end_point[0], node.end_point[1]])
-
-            occurrence = scip_pb2.Occurrence()
-            occurrence.symbol = symbol_id
-            occurrence.symbol_roles = scip_pb2.Definition
-            occurrence.syntax_kind = scip_pb2.IdentifierFunction
-            occurrence.range.CopyFrom(range_obj)
-            return occurrence
-        except:
-            return None
-
-    def _create_js_class_occurrence(self, node, symbol_id: str) -> Optional[scip_pb2.Occurrence]:
-        """Create SCIP occurrence for JavaScript class."""
-        if not self.position_calculator:
-            return None
-
-        try:
-            # Convert Tree-sitter node to range (simplified)
-            range_obj = scip_pb2.Range()
-            range_obj.start.extend([node.start_point[0], node.start_point[1]])
-            range_obj.end.extend([node.end_point[0], node.end_point[1]])
-
-            occurrence = scip_pb2.Occurrence()
-            occurrence.symbol = symbol_id
-            occurrence.symbol_roles = scip_pb2.Definition
-            occurrence.syntax_kind =
scip_pb2.IdentifierType
-            occurrence.range.CopyFrom(range_obj)
-            return occurrence
-        except:
-            return None
-
-    def _create_js_function_symbol_info(self, node, symbol_id: str, name: str, relationships: Optional[List[scip_pb2.Relationship]] = None) -> scip_pb2.SymbolInformation:
-        """Create SCIP symbol information for JavaScript function."""
-        symbol_info = scip_pb2.SymbolInformation()
-        symbol_info.symbol = symbol_id
-        symbol_info.display_name = name
-        symbol_info.kind = scip_pb2.Function
-        symbol_info.documentation.append("JavaScript function")
-        if relationships and self.relationship_manager:
-            self.relationship_manager.add_relationships_to_symbol(symbol_info, relationships)
-        return symbol_info
-
-    def _create_js_class_symbol_info(self, node, symbol_id: str, name: str, relationships: Optional[List[scip_pb2.Relationship]] = None) -> scip_pb2.SymbolInformation:
-        """Create SCIP symbol information for JavaScript class."""
-        symbol_info = scip_pb2.SymbolInformation()
-        symbol_info.symbol = symbol_id
-        symbol_info.display_name = name
-        symbol_info.kind = scip_pb2.Class
-        symbol_info.documentation.append("JavaScript class")
-        if relationships and self.relationship_manager:
-            self.relationship_manager.add_relationships_to_symbol(symbol_info, relationships)
-        return symbol_info
diff --git a/uv.lock b/uv.lock
index f61a86d..a2c9dde 100644
--- a/uv.lock
+++ b/uv.lock
@@ -49,14 +49,14 @@ wheels = [
 
 [[package]]
 name = "code-index-mcp"
-version = "2.1.0"
+version = "2.1.2"
 source = { editable = "."
}
 dependencies = [
+    { name = "libclang" },
     { name = "mcp" },
     { name = "pathspec" },
     { name = "protobuf" },
     { name = "tree-sitter" },
-    { name = "tree-sitter-c" },
     { name = "tree-sitter-java" },
     { name = "tree-sitter-javascript" },
     { name = "tree-sitter-typescript" },
@@ -66,11 +66,11 @@ dependencies = [
 
 [package.metadata]
 requires-dist = [
+    { name = "libclang", specifier = ">=16.0.0" },
     { name = "mcp", specifier = ">=0.3.0" },
     { name = "pathspec", specifier = ">=0.12.1" },
     { name = "protobuf", specifier = ">=4.21.0" },
     { name = "tree-sitter", specifier = ">=0.20.0" },
-    { name = "tree-sitter-c", specifier = ">=0.20.0" },
     { name = "tree-sitter-java", specifier = ">=0.20.0" },
     { name = "tree-sitter-javascript", specifier = ">=0.20.0" },
     { name = "tree-sitter-typescript", specifier = ">=0.20.0" },
@@ -151,6 +151,23 @@ wheels = [
     { url = "https://files.pythonhosted.org/packages/76/c6/c88e154df9c4e1a2a66ccf0005a88dfb2650c1dffb6f5ce603dfbd452ce3/idna-3.10-py3-none-any.whl", hash = "sha256:946d195a0d259cbba61165e88e65941f16e9b36ea6ddb97f00452bae8b1287d3", size = 70442 },
 ]
 
+[[package]]
+name = "libclang"
+version = "18.1.1"
+source = { registry = "https://pypi.org/simple" }
+sdist = { url = "https://files.pythonhosted.org/packages/6e/5c/ca35e19a4f142adffa27e3d652196b7362fa612243e2b916845d801454fc/libclang-18.1.1.tar.gz", hash = "sha256:a1214966d08d73d971287fc3ead8dfaf82eb07fb197680d8b3859dbbbbf78250", size = 39612 }
+wheels = [
+    { url = "https://files.pythonhosted.org/packages/4b/49/f5e3e7e1419872b69f6f5e82ba56e33955a74bd537d8a1f5f1eff2f3668a/libclang-18.1.1-1-py2.py3-none-macosx_11_0_arm64.whl", hash = "sha256:0b2e143f0fac830156feb56f9231ff8338c20aecfe72b4ffe96f19e5a1dbb69a", size = 25836045 },
+    { url = "https://files.pythonhosted.org/packages/e2/e5/fc61bbded91a8830ccce94c5294ecd6e88e496cc85f6704bf350c0634b70/libclang-18.1.1-py2.py3-none-macosx_10_9_x86_64.whl", hash = "sha256:6f14c3f194704e5d09769108f03185fce7acaf1d1ae4bbb2f30a72c2400cb7c5", size = 26502641 },
+    {
url = "https://files.pythonhosted.org/packages/db/ed/1df62b44db2583375f6a8a5e2ca5432bbdc3edb477942b9b7c848c720055/libclang-18.1.1-py2.py3-none-macosx_11_0_arm64.whl", hash = "sha256:83ce5045d101b669ac38e6da8e58765f12da2d3aafb3b9b98d88b286a60964d8", size = 26420207 },
+    { url = "https://files.pythonhosted.org/packages/1d/fc/716c1e62e512ef1c160e7984a73a5fc7df45166f2ff3f254e71c58076f7c/libclang-18.1.1-py2.py3-none-manylinux2010_x86_64.whl", hash = "sha256:c533091d8a3bbf7460a00cb6c1a71da93bffe148f172c7d03b1c31fbf8aa2a0b", size = 24515943 },
+    { url = "https://files.pythonhosted.org/packages/3c/3d/f0ac1150280d8d20d059608cf2d5ff61b7c3b7f7bcf9c0f425ab92df769a/libclang-18.1.1-py2.py3-none-manylinux2014_aarch64.whl", hash = "sha256:54dda940a4a0491a9d1532bf071ea3ef26e6dbaf03b5000ed94dd7174e8f9592", size = 23784972 },
+    { url = "https://files.pythonhosted.org/packages/fe/2f/d920822c2b1ce9326a4c78c0c2b4aa3fde610c7ee9f631b600acb5376c26/libclang-18.1.1-py2.py3-none-manylinux2014_armv7l.whl", hash = "sha256:cf4a99b05376513717ab5d82a0db832c56ccea4fd61a69dbb7bccf2dfb207dbe", size = 20259606 },
+    { url = "https://files.pythonhosted.org/packages/2d/c2/de1db8c6d413597076a4259cea409b83459b2db997c003578affdd32bf66/libclang-18.1.1-py2.py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:69f8eb8f65c279e765ffd28aaa7e9e364c776c17618af8bff22a8df58677ff4f", size = 24921494 },
+    { url = "https://files.pythonhosted.org/packages/0b/2d/3f480b1e1d31eb3d6de5e3ef641954e5c67430d5ac93b7fa7e07589576c7/libclang-18.1.1-py2.py3-none-win_amd64.whl", hash = "sha256:4dd2d3b82fab35e2bf9ca717d7b63ac990a3519c7e312f19fa8e86dcc712f7fb", size = 26415083 },
+    { url = "https://files.pythonhosted.org/packages/71/cf/e01dc4cc79779cd82d77888a88ae2fa424d93b445ad4f6c02bfc18335b70/libclang-18.1.1-py2.py3-none-win_arm64.whl", hash = "sha256:3f0e1f49f04d3cd198985fea0511576b0aee16f9ff0e0f0cad7f9c57ec3c20e8", size = 22361112 },
+]
+
 [[package]]
 name = "mcp"
 version = "1.4.1"
@@ -381,21 +398,6 @@ wheels = [
     { url =
"https://files.pythonhosted.org/packages/ce/33/3591e7b22dd49f46ae4fdee1db316ecefd0486cae880c5b497a55f0ccb24/tree_sitter-0.25.1-cp314-cp314-win_arm64.whl", hash = "sha256:f7b68f584336b39b2deab9896b629dddc3c784170733d3409f01fe825e9c04eb", size = 117376 }, ] -[[package]] -name = "tree-sitter-c" -version = "0.24.1" -source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/f1/f5/ba8cd08d717277551ade8537d3aa2a94b907c6c6e0fbcf4e4d8b1c747fa3/tree_sitter_c-0.24.1.tar.gz", hash = "sha256:7d2d0cda0b8dda428c81440c1e94367f9f13548eedca3f49768bde66b1422ad6", size = 228014 } -wheels = [ - { url = "https://files.pythonhosted.org/packages/15/c7/c817be36306e457c2d36cc324789046390d9d8c555c38772429ffdb7d361/tree_sitter_c-0.24.1-cp310-abi3-macosx_10_9_x86_64.whl", hash = "sha256:9c06ac26a1efdcc8b26a8a6970fbc6997c4071857359e5837d4c42892d45fe1e", size = 80940 }, - { url = "https://files.pythonhosted.org/packages/7a/42/283909467290b24fdbc29bb32ee20e409a19a55002b43175d66d091ca1a4/tree_sitter_c-0.24.1-cp310-abi3-macosx_11_0_arm64.whl", hash = "sha256:942bcd7cbecd810dcf7ca6f8f834391ebf0771a89479646d891ba4ca2fdfdc88", size = 86304 }, - { url = "https://files.pythonhosted.org/packages/94/53/fb4f61d4e5f15ec3da85774a4df8e58d3b5b73036cf167f0203b4dd9d158/tree_sitter_c-0.24.1-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9a74cfd7a11ca5a961fafd4d751892ee65acae667d2818968a6f079397d8d28c", size = 109996 }, - { url = "https://files.pythonhosted.org/packages/5e/e8/fc541d34ee81c386c5453c2596c1763e8e9cd7cb0725f39d7dfa2276afa4/tree_sitter_c-0.24.1-cp310-abi3-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a6a807705a3978911dc7ee26a7ad36dcfacb6adfc13c190d496660ec9bd66707", size = 98137 }, - { url = "https://files.pythonhosted.org/packages/32/c6/d0563319cae0d5b5780a92e2806074b24afea2a07aa4c10599b899bda3ec/tree_sitter_c-0.24.1-cp310-abi3-musllinux_1_2_x86_64.whl", hash = 
"sha256:789781afcb710df34144f7e2a20cd80e325114b9119e3956c6bd1dd2d365df98", size = 94148 }, - { url = "https://files.pythonhosted.org/packages/50/5a/6361df7f3fa2310c53a0d26b4702a261c332da16fa9d801e381e3a86e25f/tree_sitter_c-0.24.1-cp310-abi3-win_amd64.whl", hash = "sha256:290bff0f9c79c966496ebae45042f77543e6e4aea725f40587a8611d566231a8", size = 84703 }, - { url = "https://files.pythonhosted.org/packages/22/6a/210a302e8025ac492cbaea58d3720d66b7d8034c5d747ac5e4d2d235aa25/tree_sitter_c-0.24.1-cp310-abi3-win_arm64.whl", hash = "sha256:d46bbda06f838c2dcb91daf767813671fd366b49ad84ff37db702129267b46e1", size = 82715 }, -] - [[package]] name = "tree-sitter-java" version = "0.23.5"