Damn, I can't believe a simple Django + ElasticSearch + Haystack setup took me this long to debug...


From the column: Excalibur

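I've been working through Django by Example lately, but the tutorial is quite old and implements search with Solr, and I could find almost no code online that does it any other way. So I figured I would just write it myself; it shouldn't be hard. It took far longer than I expected, but I did get it working in the end...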

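First, the Haystack version I used only supports Elasticsearch 1.x and 2.x, so be careful not to install the wrong one. I installed Elasticsearch 2.x; Java has to be installed before Elasticsearch, of course. To begin, settings.py is configured as follows: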

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'haystack',
    'blog',
    'taggit',
    'django.contrib.sites',
    'django.contrib.sitemaps',
]

HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch2_backend.Elasticsearch2SearchEngine',
        'URL': 'http://127.0.0.1:9200/',
        'INDEX_NAME': 'blog_search',
    },
}

HAYSTACK_SIGNAL_PROCESSOR = 'haystack.signals.RealtimeSignalProcessor'
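Because the backend configured above targets Elasticsearch 2.x, it can help to confirm which version is actually answering at the configured URL before building an index. The sketch below is not part of Haystack: the names `is_supported_version` and `fetch_es_version` are made up for illustration, and the check relies on the `version.number` field that Elasticsearch returns from its root endpoint.

```python
import json
from urllib.request import urlopen

# Major versions the Haystack backend in this article can talk to.
SUPPORTED_MAJORS = (1, 2)


def is_supported_version(version, supported=SUPPORTED_MAJORS):
    """Return True if a version string such as '2.4.6' has a supported major."""
    major = int(version.split(".")[0])
    return major in supported


def fetch_es_version(url="http://127.0.0.1:9200/"):
    """Ask a running Elasticsearch node for its version string.

    Hypothetical helper: it needs a live node at `url`.
    """
    with urlopen(url, timeout=5) as response:
        info = json.loads(response.read().decode("utf-8"))
    return info["version"]["number"]
```

With a node running, `is_supported_version(fetch_es_version())` should come back true for a 2.x install.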

Since I want to search the content of blog posts, here are the fields of the Post model in models.py:

from django.contrib.auth.models import User
from django.db import models
from django.urls import reverse
from django.utils import timezone
from taggit.managers import TaggableManager


class Post(models.Model):
    STATUS_CHOICES = (
        ('draft', 'Draft'),
        ('published', 'Published'),
    )
    title = models.CharField(max_length=250)
    slug = models.SlugField(max_length=250, unique_for_date='publish')
    author = models.ForeignKey(User, related_name='blog_posts', on_delete=models.CASCADE)
    body = models.TextField()
    publish = models.DateTimeField(default=timezone.now)
    created = models.DateTimeField(auto_now_add=True)
    updated = models.DateTimeField(auto_now=True)
    status = models.CharField(max_length=10, choices=STATUS_CHOICES, default='draft')

    objects = models.Manager()
    # PublishedManager is a custom manager defined elsewhere in models.py (not shown here)
    published_objects = PublishedManager()
    tags = TaggableManager()

    class Meta:
        ordering = ('-publish',)

    def __str__(self):
        return self.title

    def get_absolute_url(self):
        return reverse('blog:post_detail',
                       args=[self.publish.year,
                             self.publish.strftime('%m'),
                             self.publish.strftime('%d'),
                             self.slug])

Then, in forms.py, I subclass SearchForm from haystack.forms:

from haystack.forms import SearchForm


class BlogSearchForm(SearchForm):
    def no_query_found(self):
        # When the query is empty, return every indexed document
        return self.searchqueryset.all()

Next comes search_indexes.py, which is required:

from haystack import indexes
from .models import Post


class PostIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, use_template=True)
    title = indexes.CharField(model_attr='title')
    slug = indexes.CharField(model_attr='slug')
    body = indexes.CharField(model_attr='body')

    def get_model(self):
        return Post

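The text field here follows the usual Haystack convention: document=True marks it as the main field to search, and use_template=True lets a data template decide what goes into the document that the search engine indexes.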

Next, create the template file at templates/search/indexes/blog/post_text.txt, so that the search engine knows which fields to build the index from:

{{ object.title }}
{{ object.slug }}
{{ object.body }}
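To make the role of this template concrete: for each Post, Haystack renders the file with the instance bound to `object` and stores the result in the `text` field. The following is only a plain-Python approximation of that rendering step, assuming the three variables sit on separate lines; `FakePost` and `build_document_text` are hypothetical names, and the real rendering goes through Django's template engine.

```python
from dataclasses import dataclass


@dataclass
class FakePost:
    """Stand-in for the blog Post model (illustration only)."""
    title: str
    slug: str
    body: str


def build_document_text(post):
    """Approximate what post_text.txt produces: one field per line."""
    return "\n".join([post.title, post.slug, post.body])


post = FakePost(
    title="Hello Haystack",
    slug="hello-haystack",
    body="Indexing with Elasticsearch.",
)
document_text = build_document_text(post)
print(document_text)
```

Whatever ends up in this text is what a `content=` search matches against, so a field left out of the template is not searchable through `text`.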

Now the index can be built with the command python manage.py rebuild_index. Once that succeeds, it is time to write views.py:

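from django.shortcuts import render
from haystack.query import SearchQuerySet

from .forms import BlogSearchForm
from .models import Post


def post_search(request):
    form = BlogSearchForm()
    cd = None
    results = None
    total_results = None
    if 'q' in request.GET:
        form = BlogSearchForm(request.GET)
        if form.is_valid():
            cd = form.cleaned_data
            # Search only Post documents and load the matching model instances
            results = SearchQuerySet().models(Post).filter(content=cd['q']).load_all()
            total_results = results.count()
    return render(request, 'search/search.html',
                  {'form': form,
                   'cd': cd,
                   'results': results,
                   'total_results': total_results})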

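A small debugging tip: when you don't know which attributes the search results expose, inspect them with print(dir(...)). Otherwise it is hard to work out what to adjust from inside the framework.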

My first version of views.py actually looked like this:

def post_search(request):
    form = BlogSearchForm(request.GET)
    # The form must be passed to the template, otherwise it will not be displayed
    blogs = form.search()
    print(dir(blogs[0]))
    return render(request, 'search/search.html', {'blogs': blogs, 'form': form})
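The `print(dir(...))` trick gets noisy because `dir()` also lists dunder and private names. A small sketch of a filter that keeps only the public attributes follows; `public_attrs` and `FakeResult` are hypothetical, with `FakeResult` standing in for a search result so that the snippet runs without a search backend.

```python
def public_attrs(obj):
    """Return the public attribute names of obj (no leading underscore)."""
    return [name for name in dir(obj) if not name.startswith("_")]


class FakeResult:
    """Stand-in for a search result (illustration only, not Haystack's class)."""

    def __init__(self):
        self.pk = 1
        self.score = 0.5
        self.object = None
        self._cache = {}  # private, so it is filtered out


names = public_attrs(FakeResult())
print(names)
```

In the view, `print(public_attrs(blogs[0]))` could replace the raw `dir()` call, ideally guarded so that an empty result set does not raise an IndexError.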

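Note that a plain query here takes the form q=query (Haystack's SearchForm names its field q), which differs slightly from the query parameter used with Solr in the original tutorial.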
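As a small illustration of the `q=query` form, here is how such a query string is built and read back with the standard library. The path `/blog/search/` is an assumption for the example; the actual URL depends on the project's urls.py.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical search path; the real one comes from the project's URL config.
SEARCH_PATH = "/blog/search/"


def build_search_url(term, path=SEARCH_PATH):
    """Build a GET URL whose search term is carried in the `q` parameter."""
    return path + "?" + urlencode({"q": term})


def extract_query(url):
    """Read the `q` parameter back out of a search URL ('' if absent)."""
    params = parse_qs(urlsplit(url).query)
    return params.get("q", [""])[0]


url = build_search_url("django haystack")
print(url)
print(extract_query(url))
```

This mirrors the `if 'q' in request.GET` check in the view: a URL without the parameter shows the empty form, and one with it triggers the search.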

Finally, pass the variables from views.py into the template. search.html looks like this:

{% extends "blog/base.html" %}

{% block title %}Search{% endblock %}

{% block content %}
  {% if "q" in request.GET %}
    <h1>Posts containing "{{ cd.q }}"</h1>
    <h3>Found {{ total_results }} result{{ total_results|pluralize }}</h3>
    {% for result in results %}
      {% with post=result.object %}
        <h4><a href="{{ post.get_absolute_url }}">{{ post.title }}</a></h4>
        {{ post.body|truncatewords:5 }}
      {% endwith %}
    {% empty %}
      <p>There are no results for your query.</p>
    {% endfor %}
    <p><a href="{% url "blog:post_search" %}">Search again</a></p>
  {% else %}
    <h1>Search for posts</h1>
    <form action="" method="get">
      {{ form.as_p }}
      <input type="submit" value="Search">
    </form>
  {% endif %}
{% endblock %}

The code is available on my GitHub, in the repository willwinworld/web-related.


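With that, the first three chapters of Django by Example are done. The important thing is still to understand how the data flows; the APIs become familiar with practice. One more note: when rebuilding the index, you may need to clear out what is stored in Elasticsearch repeatedly. From the Git client's shell you can delete everything in one go with curl -X DELETE localhost:9200/_all. Be aware that this removes every index on the node, not only the blog's documents.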
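Since `_all` removes every index on the node, a narrower alternative is to delete only the index named in `INDEX_NAME`. The sketch below builds that DELETE request with the standard library; `build_delete_request` and `delete_index` are hypothetical helper names, and the default host and index name are taken from the settings shown earlier. Haystack's own `python manage.py clear_index` command is another option for emptying the index.

```python
from urllib.request import Request, urlopen


def build_delete_request(index_name="blog_search", base_url="http://127.0.0.1:9200"):
    """Build (but do not send) a DELETE request for a single index."""
    url = base_url.rstrip("/") + "/" + index_name
    return Request(url, method="DELETE")


def delete_index(index_name="blog_search", base_url="http://127.0.0.1:9200"):
    """Send the request; this needs a running Elasticsearch node."""
    request = build_delete_request(index_name, base_url)
    with urlopen(request, timeout=5) as response:
        return response.status


request = build_delete_request()
print(request.get_method(), request.full_url)
```

Building the request separately from sending it makes it easy to check the target URL before anything is deleted.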
