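1. Team name, member introductions, task assignment, and links to each member's course-design blog
2. Project overview and technologies used
A search engine for the college website. It crawls the site, builds an index, and supports search, snippet display, and search by time range.
Reference project: "Java Team Course Design: A College-Based Search Engine". The class of 2018 seniors did impressive work, and we did our best to match it.
Technologies used: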
- Jsoup
- HTML + CSS, JavaScript
- jQuery&jQuery-UI
- Bootstrap5
- Elasticsearch
- IK analyzer
- Servlet
- JSP
- Maven, Git
- Windows
3. Git repository for this project
https://github.com/lrui1/LCZ-SearchEngine
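4. Screenshot of the project's Git commit history


5. Preliminary research
5.1 Search home page

5.2 Search results page

6. Flowchart of the main features

7. Object-oriented design class diagrams
Crawler module

Elasticsearch module

GUI module

8. Screenshots of the running project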

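9. Key code, described by module
Crawler module
Collect every distinct class attribute value used by div elements on the School of Computer Engineering website

    Set<String> classSet = new HashSet<>();
    Elements divs = document.getElementsByTag("div");
    for (Element element : divs) {
        String aClass = element.attr("class");
        // Compare by content; != on strings only compares references
        if (!aClass.isEmpty()) {
            classSet.add(aClass);
        }
    }
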
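A div often carries several class names in one attribute value, so the collected set may contain combined strings rather than single names. The sketch below splits such values into individual names using only the standard library; the sample attribute values are hypothetical and not taken from the college site. jsoup's `Element.classNames()` may serve the same purpose directly on an element.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Sketch: split raw class attribute values into individual class names. */
class ClassNames {
    static Set<String> split(Iterable<String> classAttributes) {
        Set<String> names = new TreeSet<>();
        for (String attribute : classAttributes) {
            if (attribute == null) {
                continue;
            }
            // A class attribute is a whitespace-separated list of names
            for (String name : attribute.trim().split("\\s+")) {
                if (!name.isEmpty()) {
                    names.add(name);
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical attribute values, for illustration only
        System.out.println(split(Arrays.asList("col col-lg-7", "content", "", "col mt-3")));
    }
}
```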
Parse the site and fill in each ResultEntry

    Set<ResultEntry> backSet = new HashSet<>();
    Elements select = connection.select(selection);
    for (Element element : select) {
        Elements a = element.getElementsByTag("a");
        ResultEntry e = new ResultEntry();
        String href = a.attr("href");
        e.setUrl("http://cec.jmu.edu.cn/" + href);
        String title = a.attr("title");
        e.setTitle(title);
        try {
            // Fetch the page once and reuse it for both the text and the publish time
            Document page = Jsoup.connect(e.getUrl()).get();
            e.setText(page.text());
            e.setDeclareTime(getDeclearTime(page));
            backSet.add(e);
        } catch (Exception ex) {
            // Skip pages that cannot be fetched or parsed
            continue;
        }
    }
    return new ArrayList<>(backSet);

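Prefixing the site root to every href only works for relative links without a leading slash. A minimal sketch of a more tolerant approach, using `java.net.URI.resolve` from the standard library, is shown below; the `LinkResolver` class and the sample paths are hypothetical, and only the base URL comes from the crawler code.

```java
import java.net.URI;

/** Sketch: resolve crawled href values against the site root. */
class LinkResolver {
    // Base URL taken from the crawler code above
    static final URI BASE = URI.create("http://cec.jmu.edu.cn/");

    /** Returns an absolute URL for a relative or absolute href, or null if it is unusable. */
    static String resolve(String href) {
        if (href == null || href.isBlank()) {
            return null;
        }
        try {
            return BASE.resolve(href.trim()).toString();
        } catch (IllegalArgumentException e) {
            // Malformed href, for example one containing spaces
            return null;
        }
    }

    public static void main(String[] args) {
        // Hypothetical paths, for illustration only
        System.out.println(resolve("info/1001/2345.htm"));
        System.out.println(resolve("/info/1001/2345.htm"));
        System.out.println(resolve("http://www.jmu.edu.cn/index.htm"));
    }
}
```

Relative paths, root-relative paths, and already absolute links all come out as absolute URLs, so the crawler would not need to special-case them.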
Get the publish time

    String text = connection.select("div.er_right_xnew_date").text();
    String marker = "时间:";
    int indexOf = text.indexOf(marker);
    // Return whatever follows the marker, or null when the marker is absent
    if (indexOf != -1) {
        return text.substring(indexOf + marker.length());
    } else {
        return null;
    }

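Because the publish time is later indexed into a field mapped as `date`, it may help to validate the extracted string before storing it. The sketch below assumes the page prints the date as `yyyy-MM-dd` right after the marker, possibly followed by other text; that format and the sample strings are assumptions, not something confirmed from the site.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

/** Sketch: extract and validate the publish date that follows the marker. */
class PublishTime {
    // Same marker as in the crawler code
    static final String MARKER = "时间:";

    /** Returns the date as yyyy-MM-dd, or null if it is missing or not a valid date. */
    static String extract(String text) {
        if (text == null) {
            return null;
        }
        int index = text.indexOf(MARKER);
        if (index == -1) {
            return null;
        }
        String rest = text.substring(index + MARKER.length()).strip();
        // Keep only the first whitespace-delimited token after the marker
        int end = 0;
        while (end < rest.length() && !Character.isWhitespace(rest.charAt(end))) {
            end++;
        }
        try {
            // Assumption: ISO-8601 calendar date, e.g. 2022-12-01
            return LocalDate.parse(rest.substring(0, end)).toString();
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Hypothetical page text, for illustration only
        System.out.println(extract("发布" + MARKER + "2022-12-01 点击:12"));
    }
}
```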
Merge the list returned by a function into the output list

    printList.addAll(addList);
    return printList;

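If the same article can be reached from more than one list page (an assumption about the site), a plain merge may leave duplicates in the output list. One possible approach is to merge by a key such as the URL, keeping the first entry seen. The helper below is a hypothetical sketch and is generic so that it runs without the project's `ResultEntry` class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Sketch: merge two lists while dropping items whose key was already seen. */
class MergeSketch {
    static <T> List<T> mergeDistinct(List<T> printList, List<T> addList, Function<T, String> key) {
        // LinkedHashMap keeps insertion order, so the output order stays stable
        Map<String, T> byKey = new LinkedHashMap<>();
        for (T item : printList) {
            byKey.putIfAbsent(key.apply(item), item);
        }
        for (T item : addList) {
            byKey.putIfAbsent(key.apply(item), item);
        }
        return new ArrayList<>(byKey.values());
    }

    public static void main(String[] args) {
        // Strings stand in for entries; the string itself acts as the key
        List<String> merged = mergeDistinct(
                Arrays.asList("a.htm", "b.htm"), Arrays.asList("b.htm", "c.htm"), s -> s);
        System.out.println(merged);
    }
}
```

With the project's types, a call might look like `mergeDistinct(printList, addList, ResultEntry::getUrl)`.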
Elasticsearch module
Connecting with the Elasticsearch Java API Client

    public static ElasticsearchClient getConnect() {
        // Basic authentication credentials for the cluster
        final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
        credentialsProvider.setCredentials(
                AuthScope.ANY, new UsernamePasswordCredentials(USERNAME, PASSWORD));
        RestClientBuilder builder = RestClient.builder(new HttpHost(URL, PORT))
                .setHttpClientConfigCallback(httpAsyncClientBuilder -> httpAsyncClientBuilder
                        .setDefaultCredentialsProvider(credentialsProvider));
        restClient = builder.build();
        transport = new RestClientTransport(restClient, new JacksonJsonpMapper());
        return new ElasticsearchClient(transport);
    }

Create the index and write the mapping

    public boolean newIndex(Reader reader) {
        CreateIndexRequest createIndexRequest = new CreateIndexRequest.Builder()
                .withJson(reader)
                .index(EsUtil.index)
                .build();

        CreateIndexResponse response = null;
        try {
            response = client.indices().create(createIndexRequest);
        } catch (Exception e) {
            // Report the failure instead of swallowing it silently
            e.printStackTrace();
        }
        if (response != null) {
            return Objects.requireNonNullElse(response.acknowledged(), false);
        } else {
            return false;
        }
    }

    public static void main(String[] args) {
        // title gets a completion sub-field for suggestions; text fields use the IK analyzer
        Reader reader = new StringReader("{\n" +
                "  \"mappings\": {\n" +
                "    \"properties\": {\n" +
                "      \"url\":{\"type\": \"keyword\"},\n" +
                "      \"title\":{\n" +
                "        \"type\": \"text\",\n" +
                "        \"analyzer\": \"ik_max_word\", \n" +
                "        \"fields\": {\n" +
                "          \"suggest\": {\n" +
                "            \"type\": \"completion\",\n" +
                "            \"analyzer\": \"ik_max_word\"\n" +
                "          }\n" +
                "        }\n" +
                "      },\n" +
                "      \"text\":{\n" +
                "        \"type\": \"text\",\n" +
                "        \"analyzer\": \"ik_max_word\"\n" +
                "      },\n" +
                "      \"declareTime\": {\n" +
                "        \"type\": \"date\"\n" +
                "      }\n" +
                "    }\n" +
                "  }\n" +
                "}");
        boolean newBool = search.newIndex(reader);
    }

Add a document

    public ResultEntry add(ResultEntry entry) {
        try {
            client.index(i -> i
                    .index(EsUtil.index)
                    .document(entry));
        } catch (IOException e) {
            // Signal failure to the caller with null
            return null;
        }
        return entry;
    }

Full-text search (defaults to the first page)

    public List<ResultEntry> search(String searchText, int page) {
        // Offset of the first hit on the requested page (10 hits per page)
        int value = (page - 1) * 10;
        SearchResponse<ResultEntry> search = null;

        try {
            search = client.search(s -> s
                    .index(EsUtil.index)
                    .query(q -> q
                            .multiMatch(m -> m
                                    .query(searchText)
                                    .fields("title", "text")
                                    .analyzer("ik_smart")))
                    .highlight(h -> h
                            .preTags("<span class=\"hit-result\">")
                            .postTags("</span>")
                            .fields("title", builder -> builder)
                            .fields("text", builder -> builder))
                    .from(value)
                    .size(10)
                    , ResultEntry.class);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return dealSearchResponse(search);
    }

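The offset passed to `from` is derived directly from the page number, so a page below 1 would produce a negative offset, and very deep pages would exceed Elasticsearch's default `index.max_result_window` of 10,000. A minimal sketch of clamping the page before computing the offset follows; the `PageWindow` class is hypothetical, and the limit assumes the index keeps the default setting.

```java
/** Sketch: compute a safe "from" offset for a 1-based page number. */
class PageWindow {
    static final int PAGE_SIZE = 10;
    // Default value of index.max_result_window; assumed unchanged for this index
    static final int MAX_RESULT_WINDOW = 10_000;

    static int fromOffset(int page) {
        int maxPage = MAX_RESULT_WINDOW / PAGE_SIZE;
        // Clamp the page into [1, maxPage] so from + size stays within the window
        int safePage = Math.max(1, Math.min(page, maxPage));
        return (safePage - 1) * PAGE_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(fromOffset(1));
        System.out.println(fromOffset(3));
        System.out.println(fromOffset(-2));
    }
}
```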
Search by time

    public List<ResultEntry> search(String searchText, int page, String beginDate) {
        int value = (page - 1) * 10;
        JsonData jsonBeginDate = JsonData.of(beginDate);
        SearchResponse<ResultEntry> search = null;
        try {
            search = client.search(s -> s
                    .index(EsUtil.index)
                    .query(q -> q
                            .bool(b -> b
                                    // Relevance-scored match on title and text
                                    .must(b1 -> b1
                                            .multiMatch(b2 -> b2
                                                    .query(searchText)
                                                    .fields("title", "text")
                                                    .analyzer("ik_smart")))
                                    // Non-scoring filter: published on or after beginDate
                                    .filter(b3 -> b3
                                            .range(b4 -> b4
                                                    .field("declareTime")
                                                    .gte(jsonBeginDate)))))
                    .highlight(h -> h
                            .preTags("<span class=\"hit-result\">")
                            .postTags("</span>")
                            .fields("title", builder -> builder)
                            .fields("text", builder -> builder))
                    .from(value)
                    .size(10)
                    , ResultEntry.class);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return dealSearchResponse(search);
    }

Web front end
Search suggestions: as the user types, the page requests SearchSuggest asynchronously and renders the returned data with jQuery-UI autocomplete

    $(function () {
        $(".search-input").autocomplete({
            source: function (request, response) {
                $.ajax({
                    type: "get",
                    url: "SearchSuggest",
                    // The response is URI-encoded JSON text, decoded in the success handler
                    dataType: "text",
                    data: {"input": request.term},
                    error: function () {
                        console.error("Load recommend data failed!");
                        // Always answer the widget, even on failure
                        response([]);
                    },
                    success: function (data) {
                        response(JSON.parse(decodeURI(data)));
                    }
                });
            }
        });
    });

Pagination with jqPaginator: when the user picks a page, a GET request navigates to search.jsp to show the results

    $("#my-pagination").jqPaginator({
        // Ceiling division, with at least one page
        totalPages: <%=Math.max(1, (searchCount + 9) / 10)%>,
        visiblePages: 10,
        currentPage: <%=currentPage%>,
        first: '<li class="first page-item"><a class="page-link" href="javascript:;">首页</a></li>',
        prev: '<li class="prev page-item"><a class="page-link" href="javascript:;">上一页</a></li>',
        next: '<li class="next page-item"><a class="page-link" href="javascript:;">下一页</a></li>',
        last: '<li class="last page-item"><a class="page-link" href="javascript:;">末页</a></li>',
        page: '<li class="page page-item"><a class="page-link" href="javascript:;">{{page}}</a></li>',
        onPageChange: function (num, type) {
            $('#my-pagination-text').html('当前第' + num + '页');
            // Only navigate on a user-initiated change, not on initialization
            if ("change" == type) {
                let inputText = getQueryVariable("inputText");
                let beginDate = getQueryVariable("beginDate");
                if (beginDate != "") {
                    window.location.href = "search.jsp?inputText=" + inputText + "&page=" + num + "&beginDate=" + beginDate;
                } else {
                    window.location.href = "search.jsp?inputText=" + inputText + "&page=" + num;
                }
            }
        }
    });

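The value given to `totalPages` is a ceiling division of the hit count by the page size; a plain `hits / 10 + 1` would report one page too many whenever the hit count is an exact multiple of 10. The sketch below shows the arithmetic in Java. The `PageCount` class is hypothetical, and returning at least one page for an empty result is a design choice made here so the pager always has something to render.

```java
/** Sketch: number of pages needed for a given hit count. */
class PageCount {
    static int totalPages(long hits, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        if (hits <= 0) {
            return 1;
        }
        // Ceiling division without floating point
        return (int) ((hits + pageSize - 1) / pageSize);
    }

    public static void main(String[] args) {
        System.out.println(totalPages(20, 10));
        System.out.println(totalPages(21, 10));
    }
}
```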
Time selection: a click listener on each dropdown option issues a different GET request to search.jsp

    $(function () {
        // All time: drop the beginDate parameter
        $("#range-all").click(function () {
            let inputText = getQueryVariable("inputText");
            window.location.href = "search.jsp?inputText=" + inputText;
        });
        // Past week
        $("#range-week").click(function () {
            let myDate = new Date();
            myDate.setDate(myDate.getDate() - 7);
            let beginDate = myGetDate(myDate);
            let inputText = getQueryVariable("inputText");
            window.location.href = "search.jsp?inputText=" + inputText + "&beginDate=" + beginDate;
        });
        // Past month
        $("#range-month").click(function () {
            let myDate = new Date();
            myDate.setMonth(myDate.getMonth() - 1);
            let beginDate = myGetDate(myDate);
            let inputText = getQueryVariable("inputText");
            window.location.href = "search.jsp?inputText=" + inputText + "&beginDate=" + beginDate;
        });
        // Past year
        $("#range-year").click(function () {
            let myDate = new Date();
            myDate.setFullYear(myDate.getFullYear() - 1);
            let beginDate = myGetDate(myDate);
            let inputText = getQueryVariable("inputText");
            window.location.href = "search.jsp?inputText=" + inputText + "&beginDate=" + beginDate;
        });
    });

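The same begin dates could also be computed on the server with `java.time`, which might be preferable if the client clock is not trusted. The following is a sketch under that assumption, not the project's implementation: the `DateRange` class and its `Range` enum are hypothetical, and it assumes the `declareTime` field accepts ISO `yyyy-MM-dd` strings. One behavioral difference is worth noting: JavaScript's `Date` rolls an out-of-range day into the following month (March 31 minus one month lands in early March), whereas `LocalDate` clamps to the last valid day of the target month.

```java
import java.time.LocalDate;

/** Sketch: compute the begin date of a search range on the server. */
class DateRange {
    enum Range { ALL, WEEK, MONTH, YEAR }

    /** Returns the begin date as yyyy-MM-dd, or null when no lower bound applies. */
    static String beginDate(Range range, LocalDate today) {
        switch (range) {
            case WEEK:
                return today.minusWeeks(1).toString();
            case MONTH:
                return today.minusMonths(1).toString();
            case YEAR:
                return today.minusYears(1).toString();
            default:
                return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(beginDate(Range.MONTH, LocalDate.now()));
    }
}
```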
Output the search results: find the positions of the highlight tags in each result and use them to cut the snippet

    <%
        for (ResultEntry resultEntry : searchResult) {
            String text = resultEntry.getText();
            out.println("<div class=\"col col-lg-7 mt-3\">");
            out.println("<a class=\"address\" href=\"" + resultEntry.getUrl() + "\" target=\"_blank\">" + resultEntry.getTitle() + "</a>");
            // Keep 10 characters before the first highlight tag and about 20 after the last one
            int front = text.indexOf("<span") - 10;
            int tail = text.lastIndexOf("span>") + 25;
            if (front < 0) {
                front = 0;
            }
            // Clamp to the end of the text
            if (tail > text.length()) {
                tail = text.length();
            }
            String content = text.substring(front, tail);
            out.println("<div class=\"content\">" + content + "</div>");
            out.println("<a href=\"" + resultEntry.getUrl() + "\" style=\"font-size: small; color: gray\">" + resultEntry.getUrl() + "</a>");
            out.println("</div>");
        }
    %>

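The snippet logic can be pulled out of the JSP into a small, testable method. The sketch below is one possible shape for it: the `Snippet` class, its parameters, and the fallback for text without any highlight tag are assumptions introduced here for illustration, and the sample text is made up.

```java
/** Sketch: cut a snippet around the highlighted part of a result text. */
class Snippet {
    static final String OPEN = "<span";
    static final String CLOSE = "</span>";

    static String around(String text, int before, int after, int fallbackLength) {
        if (text == null || text.isEmpty()) {
            return "";
        }
        int first = text.indexOf(OPEN);
        int last = text.lastIndexOf(CLOSE);
        if (first < 0 || last < first) {
            // No highlight found: fall back to the start of the text
            return text.substring(0, Math.min(fallbackLength, text.length()));
        }
        // Clamp both ends so substring never goes out of range
        int start = Math.max(0, first - before);
        int end = Math.min(text.length(), last + CLOSE.length() + after);
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        String text = "0123456789ABCDE<span class=\"hit-result\">key</span>abcdefghij";
        System.out.println(around(text, 5, 3, 10));
    }
}
```

Keeping the cut outside the first opening tag and the last closing tag means the highlight markup itself is never split.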
GUI module
Initialize the page, run the search, and open the results page

    private void searchActionPerformed(java.awt.event.ActionEvent evt) {
        Page.page = 1;
        String text = input.getText();
        Input.read(text);

        // Switch from the search window to the results window
        Result result = new Result();
        result.setVisible(true);
        this.setVisible(false);
        this.dispose();
    }

Read the input and show the first search results

    private static List<ResultEntry> results;
    private static String input;

    // Run the search once and cache both the query and its results
    public static void read(String text) {
        Search search = new EsSearch();
        results = search.search(text);
        search.close();
        input = text;
    }

    public static String getText() {
        return input;
    }

    public static List<ResultEntry> getResults() {
        return results;
    }

Search again and display the results

    private void searchActionPerformed(java.awt.event.ActionEvent evt) {
        String text = input.getText();
        Input.read(text);
        input.setText(Input.getText());
        List<ResultEntry> results = Input.getResults();

        Page.page = 1;

        nowPage.setText(Page.page + "");
        if (!results.isEmpty()) {
            content1.setText(null);
            content2.setText(null);
            content3.setText(null);
            content4.setText(null);
            content5.setText(null);
            // Assumes at least five results; fewer would throw IndexOutOfBoundsException
            content1.setText(results.get(0).getText());
            content2.setText(results.get(1).getText());
            content3.setText(results.get(2).getText());
            content4.setText(results.get(3).getText());
            content5.setText(results.get(4).getText());
        }
    }

Page-by-page retrieval with previous and next page navigation

    private void prePageActionPerformed(java.awt.event.ActionEvent evt) {
        if (Page.page == 1) {
            // Already on the first page
            return;
        }
        Search search = new EsSearch();
        Page.page--;
        // Each Elasticsearch page holds 10 hits; the GUI shows 5 per page
        List<ResultEntry> results = search.search(Input.getText(), (Page.page + 1) / 2);
        if (results.isEmpty()) {
            // Nothing on that page: restore the previous page
            results = search.search(Input.getText(), (Page.page + 2) / 2);
            Page.page++;
        }
        nowPage.setText(Page.page + "");
        if (!results.isEmpty()) {
            content1.setText(null);
            content2.setText(null);
            content3.setText(null);
            content4.setText(null);
            content5.setText(null);
            // Assumes a full block of five hits; fewer would throw IndexOutOfBoundsException
            if (Page.page % 2 == 1) {
                // Odd GUI page: first half of the Elasticsearch page
                content1.setText(results.get(0).getText());
                content2.setText(results.get(1).getText());
                content3.setText(results.get(2).getText());
                content4.setText(results.get(3).getText());
                content5.setText(results.get(4).getText());
            } else {
                // Even GUI page: second half of the Elasticsearch page
                content1.setText(results.get(5).getText());
                content2.setText(results.get(6).getText());
                content3.setText(results.get(7).getText());
                content4.setText(results.get(8).getText());
                content5.setText(results.get(9).getText());
            }
        }
        search.close();
    }

    private void nextPageActionPerformed(java.awt.event.ActionEvent evt) {
        Page.page++;
        Search search = new EsSearch();
        List<ResultEntry> results = search.search(Input.getText(), (Page.page + 1) / 2);

        if (results.isEmpty()) {
            // Past the last page: step back to the page that was showing
            results = search.search(Input.getText(), (Page.page) / 2);
            Page.page--;
        }

        nowPage.setText(Page.page + "");
        if (!results.isEmpty()) {
            content1.setText(null);
            content2.setText(null);
            content3.setText(null);
            content4.setText(null);
            content5.setText(null);
            // Assumes a full block of five hits; fewer would throw IndexOutOfBoundsException
            if (Page.page % 2 == 1) {
                content1.setText(results.get(0).getText());
                content2.setText(results.get(1).getText());
                content3.setText(results.get(2).getText());
                content4.setText(results.get(3).getText());
                content5.setText(results.get(4).getText());
            } else {
                content1.setText(results.get(5).getText());
                content2.setText(results.get(6).getText());
                content3.setText(results.get(7).getText());
                content4.setText(results.get(8).getText());
                content5.setText(results.get(9).getText());
            }
        }
        search.close();
    }

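Both handlers map a GUI page of 5 results onto an Elasticsearch page of 10 and then read fixed indexes, which fails when the last page holds fewer hits. A minimal sketch of a helper that performs the same mapping but returns a bounds-safe slice is shown below. The `GuiPaging` class is hypothetical, and it assumes GUI page numbers start at 1.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Sketch: map a 5-per-page GUI page onto a 10-per-page search result. */
class GuiPaging {
    static final int GUI_PAGE_SIZE = 5;

    /** Search page (10 hits each) that contains the given GUI page. */
    static int esPage(int guiPage) {
        return (guiPage + 1) / 2;
    }

    /** The hits of one search page that belong on the given GUI page. */
    static <T> List<T> slice(List<T> esResults, int guiPage) {
        // Odd GUI pages show hits 0..4, even GUI pages show hits 5..9
        int from = (guiPage % 2 == 1) ? 0 : GUI_PAGE_SIZE;
        if (esResults == null || from >= esResults.size()) {
            return Collections.emptyList();
        }
        int to = Math.min(from + GUI_PAGE_SIZE, esResults.size());
        return esResults.subList(from, to);
    }

    public static void main(String[] args) {
        List<String> hits = Arrays.asList("a", "b", "c", "d", "e", "f", "g");
        System.out.println(slice(hits, 1));
        System.out.println(slice(hits, 2));
    }
}
```

A handler could then loop over the returned slice and fill only as many content fields as there are hits, clearing the rest.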
Open hyperlinks: the implementation for url1 is shown; url2 through url5 work the same way

    private void url1ActionPerformed(java.awt.event.ActionEvent evt) {
        Search search = new EsSearch();
        List<ResultEntry> results = search.search(Input.getText(), (Page.page + 1) / 2);
        Desktop desktop = Desktop.getDesktop();
        URI uri = null;
        if (!results.isEmpty()) {
            // url1 is hit 0 on odd GUI pages and hit 5 on even GUI pages
            int index = (Page.page % 2 == 1) ? 0 : 5;
            if (results.size() > index) {
                try {
                    uri = new URI(results.get(index).getUrl());
                } catch (URISyntaxException e) {
                    throw new RuntimeException(e);
                }
            }
            // Only browse when a link was actually found
            if (uri != null) {
                try {
                    desktop.browse(uri);
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            }
        }
        search.close();
    }

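Before handing a crawled URL to `Desktop.browse`, it may be worth checking that it parses and uses an expected scheme. The sketch below is a hypothetical helper for that check; restricting links to `http` and `https` is an assumption about what the GUI should open.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Optional;

/** Sketch: validate a stored URL before opening it in the browser. */
class SafeLink {
    static Optional<URI> parse(String url) {
        if (url == null || url.isBlank()) {
            return Optional.empty();
        }
        try {
            URI uri = new URI(url.trim());
            String scheme = uri.getScheme();
            // Assumption: only web links should be opened
            if ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme)) {
                return Optional.of(uri);
            }
            return Optional.empty();
        } catch (URISyntaxException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("http://cec.jmu.edu.cn/").isPresent());
        System.out.println(parse("not a url").isPresent());
    }
}
```

A handler could call `parse(entry.getUrl())` and invoke `desktop.browse` only when the returned `Optional` holds a value.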
10. Code scan results and fixes
Scan results

Fixes
if statements without braces

Missing author information in comments

Missing method descriptions
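11. Project summary (including shortcomings, outlook, and tasks we would like to finish)
Through this Java team course design, the team learned the JavaEE specification, how to write a simple crawler, big-data processing, and Web-related Java application technologies. Unfortunately, plans could not keep up with events: the whole team fell ill (everyone tested positive), which stalled the project for a while. We also chose some software versions that were too new instead of stable releases, so few examples existed online and bugs were hard to resolve. The project is still far less complete than the one built by the class of 2018 seniors.
The crawler's crawling strategy still has room for improvement, the Elasticsearch back-end search does not make use of relevance scoring, the search suggestions are not very usable, the Web front end has had only basic styling, and the GUI lacks some features.
In the future the project could separate the front end from the back end, validate user input, add search over a chosen date interval, and polish the Web interface so that it adapts better to phones. We hope that junior students who pick up this project can take it further.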