
Node Details
- Name: codeTextSplitter
- Type: CodeTextSplitter
- Version: 1.0
- Category: Text Splitters
Parameters
1. Language
- Type: Options
- Description: The programming language of the code to be split.
- Options:
- cpp
- go
- java
- js
- php
- proto
- python
- rst
- ruby
- rust
- scala
- swift
- markdown
- latex
- html
- sol
2. Chunk Size
- Type: Number
- Default: 1000
- Optional: Yes
- Description: The maximum number of characters in each chunk. This determines the size of the text segments after splitting.
3. Chunk Overlap
- Type: Number
- Default: 200
- Optional: Yes
- Description: The number of characters to overlap between chunks. This helps maintain context between split segments.
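To make the interaction between Chunk Size and Chunk Overlap concrete, the following TypeScript sketch slides a fixed-width window over a string. It is a simplified illustration, not the node's actual logic: the function name chunkWithOverlap and the sample values are hypothetical, and the real splitter also takes language syntax into account. Under this simplified model, each new chunk starts chunkSize - chunkOverlap characters after the previous one, so the defaults of 1000 and 200 would advance 800 characters per chunk.

```typescript
// Simplified illustration only: fixed-width windows that ignore syntax boundaries.
// chunkWithOverlap is a hypothetical helper, not part of the node or of LangChain.
function chunkWithOverlap(
  text: string,
  chunkSize: number,
  chunkOverlap: number
): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be in the range [0, chunkSize)");
  }

  // Each window starts this many characters after the previous one.
  const step = chunkSize - chunkOverlap;
  const chunks: string[] = [];

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a window has reached the end of the text.
    if (start + chunkSize >= text.length) {
      break;
    }
  }
  return chunks;
}

// Small values so the overlap is easy to see by eye.
const sampleText = "abcdefghijklmnopqrstuvwxyz";
const overlapWindows = chunkWithOverlap(sampleText, 10, 4);
console.log(overlapWindows);
```

In this sketch the last four characters of each window are repeated at the start of the next one, which is the context-preserving effect the Chunk Overlap parameter is meant to provide.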
Input/Output
Input
The node expects code or text input in the specified language.
Output
The node outputs split text chunks based on the specified parameters and language-specific syntax.
Usage
This node is particularly useful in workflows that involve processing or analyzing code, such as:
- Code summarization
- Code analysis tasks
- Preparing code for language models
- Splitting large codebases for easier processing
Implementation Details
The node uses the RecursiveCharacterTextSplitter.fromLanguage() method from LangChain, which applies language-specific splitting rules. Unlike a plain character-count split, it tries to break at syntactic boundaries appropriate to the given language, such as function or class definitions.
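The idea of recursive, separator-based splitting can be sketched as follows. This is a minimal illustration under stated assumptions, not LangChain's implementation: the recursiveSplit function and the PYTHON_LIKE_SEPARATORS list are hypothetical, and the sketch omits the merging of small pieces and the chunk overlap that a full splitter would apply.

```typescript
// Hypothetical separator list for illustration, ordered from the coarsest
// boundary to the finest. The lists LangChain uses per language may differ.
const PYTHON_LIKE_SEPARATORS: string[] = ["\nclass ", "\ndef ", "\n\n", "\n", " "];

// Simplified sketch: split on the coarsest separator found in the text, then
// recurse into any piece that is still longer than chunkSize.
function recursiveSplit(
  text: string,
  separators: string[],
  chunkSize: number
): string[] {
  if (text.length === 0) {
    return [];
  }
  if (text.length <= chunkSize) {
    return [text];
  }

  // Find the first (coarsest) separator that occurs in the text.
  let sepIndex = -1;
  for (let i = 0; i < separators.length; i++) {
    if (text.indexOf(separators[i]) !== -1) {
      sepIndex = i;
      break;
    }
  }

  const result: string[] = [];

  // No separator left: fall back to a hard cut every chunkSize characters.
  if (sepIndex === -1) {
    for (let i = 0; i < text.length; i += chunkSize) {
      result.push(text.slice(i, i + chunkSize));
    }
    return result;
  }

  const separator = separators[sepIndex];
  const finerSeparators = separators.slice(sepIndex + 1);
  const parts = text.split(separator);

  for (let i = 0; i < parts.length; i++) {
    // Keep the separator at the start of the piece that followed it,
    // so that no characters are lost.
    const piece = i === 0 ? parts[i] : separator + parts[i];
    const subPieces = recursiveSplit(piece, finerSeparators, chunkSize);
    for (const sub of subPieces) {
      result.push(sub);
    }
  }
  return result;
}

const pythonSource = [
  "import os",
  "",
  "def first():",
  "    return 1",
  "",
  "def second():",
  "    return 2",
].join("\n");

const codePieces = recursiveSplit(pythonSource, PYTHON_LIKE_SEPARATORS, 30);
console.log(codePieces);
```

With this sample input and a limit of 30 characters, the split happens at the function definitions rather than in the middle of a line, which is the behaviour the language-specific rules aim for. Concatenating the pieces reproduces the original text, because each separator is kept with the piece that follows it.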