Fields not in the mapping are included in the search results returned by ElasticSearch

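I want to index PDF attachments using the Tire gem as the ElasticSearch client. In my mapping, I exclude the attachment field from _source, so that the attachment is neither stored in the index nor returned in search results: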

    mapping :_source => { :excludes => ['attachment_original'] } do
      indexes :id, :type => 'integer'
      indexes :folder_id, :type => 'integer'
      indexes :attachment_file_name
      indexes :attachment_updated_at, :type => 'date'
      indexes :attachment_original, :type => 'attachment'
    end

When I run the following curl command, I can still see the attachment content in the search results:

    curl -X POST "http://localhost:9200/user_files/user_file/_search?pretty=true" -d '{
      "query": {
        "query_string": { "query": "rspec" }
      }
    }'

I posted my question in this thread:

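But I have just noticed that it is not only the attachment that is included in the search results: all other fields are included too, even the ones that are not mapped, as you can see here: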

    {
      "took": 20,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 0.025427073,
        "hits": [
          {
            "_index": "user_files",
            "_type": "user_file",
            "_id": "5",
            "_score": 0.025427073,
            "_source": {
              "user_file": {
                "id": 5,
                "folder_id": 1,
                "updated_at": "2012-08-16T11:32:41Z",
                "attachment_file_size": 179895,
                "attachment_updated_at": "2012-08-16T11:32:41Z",
                "attachment_file_name": "hw4.pdf",
                "attachment_content_type": "application/pdf",
                "created_at": "2012-08-16T11:32:41Z",
                "attachment_original": "JVBERi0xLjQKJeLjz9MKNyA"
              }
            }
          }
        ]
      }
    }

attachment_file_size and attachment_content_type are not defined in the mapping, but they are returned in the search results:

    {
      "id": 5,
      "folder_id": 1,
      "updated_at": "2012-08-16T11:32:41Z",
      "attachment_file_size": 179895,                  <---------------------
      "attachment_updated_at": "2012-08-16T11:32:41Z",
      "attachment_file_name": "hw4.pdf",
      "attachment_content_type": "application/pdf",    <---------------------
      "created_at": "2012-08-16T11:32:41Z",
      "attachment_original": "JVBERi0xLjQKJeLjz9MKNyA"
    }
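To make the mismatch concrete, here is a minimal Ruby sketch that compares the keys in a hit's _source against the fields declared in the Tire mapping block. The helper name `unmapped_source_fields` and the constant `MAPPED_FIELDS` are hypothetical and exist only for this illustration; the response is an abbreviated copy of the one shown above.

```ruby
require 'json'

# Fields declared in the Tire mapping block from the question.
MAPPED_FIELDS = %w[
  id folder_id attachment_file_name attachment_updated_at attachment_original
].freeze

# Abbreviated copy of the search response shown above.
response = JSON.parse(<<~RESPONSE_JSON)
  {
    "hits": {
      "hits": [
        {
          "_source": {
            "user_file": {
              "id": 5,
              "folder_id": 1,
              "updated_at": "2012-08-16T11:32:41Z",
              "attachment_file_size": 179895,
              "attachment_updated_at": "2012-08-16T11:32:41Z",
              "attachment_file_name": "hw4.pdf",
              "attachment_content_type": "application/pdf",
              "created_at": "2012-08-16T11:32:41Z",
              "attachment_original": "JVBERi0xLjQKJeLjz9MKNyA"
            }
          }
        }
      ]
    }
  }
RESPONSE_JSON

# Hypothetical helper: returns the _source keys that the mapping block
# never declared, across all hits.
def unmapped_source_fields(response, mapped_fields)
  response['hits']['hits'].flat_map do |hit|
    hit['_source']['user_file'].keys - mapped_fields
  end.uniq
end

unmapped = unmapped_source_fields(response, MAPPED_FIELDS)
puts unmapped.inspect
```

For this response the helper reports updated_at and created_at in addition to the two highlighted fields, since none of the four appear in the mapping block.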

Here is my full implementation:

    include Tire::Model::Search
    include Tire::Model::Callbacks

    def self.search(folder, params)
      tire.search() do
        query { string params[:query], default_operator: "AND" } if params[:query].present?
        #filter :term, folder_id: folder.id
        #highlight :attachment_original, :options => {:tag => ""}
        raise to_curl
      end
    end

    mapping :_source => { :excludes => ['attachment_original'] } do
      indexes :id, :type => 'integer'
      indexes :folder_id, :type => 'integer'
      indexes :attachment_file_name
      indexes :attachment_updated_at, :type => 'date'
      indexes :attachment_original, :type => 'attachment'
    end

    def to_indexed_json
      to_json(:methods => [:attachment_original])
    end

    def attachment_original
      if attachment_file_name.present?
        path_to_original = attachment.path
        Base64.encode64(open(path_to_original) { |f| f.read })
      end
    end

Can someone help me figure out why all fields are included in _source?

Edit: this is the output of localhost:9200/user_files/_mapping:

    {
      "user_files": {
        "user_file": {
          "_source": {
            "excludes": [ "attachment_original" ]
          },
          "properties": {
            "attachment_content_type": { "type": "string" },
            "attachment_file_name": { "type": "string" },
            "attachment_file_size": { "type": "long" },
            "attachment_original": {
              "type": "attachment",
              "path": "full",
              "fields": {
                "attachment_original": { "type": "string" },
                "author": { "type": "string" },
                "title": { "type": "string" },
                "name": { "type": "string" },
                "date": { "type": "date", "format": "dateOptionalTime" },
                "keywords": { "type": "string" },
                "content_type": { "type": "string" }
              }
            },
            "attachment_updated_at": { "type": "date", "format": "dateOptionalTime" },
            "created_at": { "type": "date", "format": "dateOptionalTime" },
            "folder_id": { "type": "integer" },
            "id": { "type": "integer" },
            "updated_at": { "type": "date", "format": "dateOptionalTime" }
          }
        }
      }
    }

As you can see, for some reason all fields are included in the mapping!

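In your to_indexed_json, you include the attachment_original method, so it is sent to elasticsearch. Because to_json also serializes every other attribute of the model, this is why all the other properties end up in the mapping, and therefore in the source.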
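One way to act on this is to serialize only the fields that the mapping declares. The following is a minimal, framework-free sketch: `UserFileSketch` is a hypothetical stand-in for the real model (it is not the author's class and does not use Tire or ActiveRecord), and the attribute values are copied from the search result in the question.

```ruby
require 'json'

# Hypothetical stand-in for the model; plain Ruby, no Tire or ActiveRecord.
class UserFileSketch
  # Attributes declared in the mapping block, apart from the attachment itself.
  MAPPED_FIELDS = %w[id folder_id attachment_file_name attachment_updated_at].freeze

  def initialize(attributes)
    @attributes = attributes
  end

  # Placeholder for the Base64-encoded file content.
  def attachment_original
    'JVBERi0xLjQKJeLjz9MKNyA'
  end

  # Build the document from an explicit whitelist instead of serializing
  # every attribute, so unmapped fields are never sent to the index.
  def to_indexed_json
    document = @attributes.select { |key, _value| MAPPED_FIELDS.include?(key) }
    document['attachment_original'] = attachment_original
    document.to_json
  end
end

user_file = UserFileSketch.new(
  'id' => 5,
  'folder_id' => 1,
  'attachment_file_size' => 179_895,
  'attachment_file_name' => 'hw4.pdf',
  'attachment_content_type' => 'application/pdf',
  'attachment_updated_at' => '2012-08-16T11:32:41Z'
)
indexed = JSON.parse(user_file.to_indexed_json)
puts indexed.keys.inspect
```

In a Rails model the same idea would usually be expressed by passing an :only option alongside :methods to to_json; treat that as an assumption to verify against the ActiveModel version in use.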

For more information on this topic, see the question "ElasticSearch & Tire: Using Mapping and to_indexed_json".

It seems that Tire is indeed sending the correct mapping JSON to elasticsearch. My advice is to use Tire.configure { logger STDERR, level: "debug" } to inspect what is happening, and to track the problem down at the raw request level.