Indexing PDFs and other binary files #1026

@dsteinkopf

Hello,

Have you ever thought about adding content from PDFs and other binary files to the Lucene search index?
I think a library like Apache Tika could make this fairly straightforward.
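
To make the idea concrete, here is a rough sketch of the flow I have in mind: run each blob through a text extractor, then index whatever text comes back. This is not Gitblit code. `TextExtractor` is a hypothetical stand-in for what a library like Tika would provide, `ToyIndex` is a toy in-memory replacement for the Lucene index, and all names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: none of these names come from Gitblit, Lucene, or Tika.
class BinaryIndexSketch {

    // Stand-in for a content extractor such as Apache Tika (assumption).
    interface TextExtractor {
        String extract(String path, byte[] content);
    }

    // Toy inverted index standing in for the Lucene index: term -> paths.
    static final class ToyIndex {
        private final Map<String, Set<String>> postings = new HashMap<>();

        // Splits text on non-alphanumeric characters and records each term.
        void add(String path, String text) {
            for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(term, k -> new TreeSet<>()).add(path);
                }
            }
        }

        // Returns the paths whose extracted text contained the term.
        Set<String> search(String term) {
            return postings.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
        }
    }

    // Extracts text from a blob and indexes it; blobs without text are skipped.
    static void indexBlob(ToyIndex index, TextExtractor extractor, String path, byte[] content) {
        String text = extractor.extract(path, content);
        if (text != null && !text.isEmpty()) {
            index.add(path, text);
        }
    }

    public static void main(String[] args) {
        // Fake extractor that decodes bytes as UTF-8; a real one would parse the PDF.
        TextExtractor fake = (path, content) -> new String(content, StandardCharsets.UTF_8);
        ToyIndex index = new ToyIndex();
        indexBlob(index, fake, "archive/invoice.pdf",
                "Invoice for March".getBytes(StandardCharsets.UTF_8));
        System.out.println(index.search("invoice"));
    }
}
```

Keeping the extractor behind an interface would let the existing text-file path stay as it is, with binary formats handled by a separate implementation.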

By the way, is there any reason why the file names themselves are not indexed?

Background: I am thinking about using Git/Gitblit as a document archive for PDFs, and a full-text search index would be great.

Any thoughts?
