Sunday, January 3, 2010

Hubble.net Demo configuration document

Step 1 Install Hubble.net


See the post "Installation of Hubble.net".

Step 2 Log in to Hubble.net


Run QueryAnalyzer and enter the IP address of the machine running the Hubble.net service. Click Login, then continue to the next step.

image

 

image

 

Step 3 Create database

Click File->Open.
Choose CreateDatabase.sql, which creates the News database.

 

image

 

The SQL for creating the database is:

 

exec sp_adddatabase 'News', 'd:\test\news\', 'SQLSERVER2005', 'Data Source=(local);Initial Catalog=News;Integrated Security=True';



 



The first parameter is the name of the database in Hubble.net, here News.

The second parameter is the default full-text index path of the database, here 'd:\test\news\'.

The third parameter is the default database adapter name, here SQLSERVER2005. The SQLSERVER2005 database adapter supports SQL Server 2005 and later versions.

The fourth parameter is the default connection string, which Hubble.net uses to connect to the relational database.





Note: Before you create a database in Hubble.net, you must first create the database in SQL Server. If you want to use a different SQL Server database, you only need to modify the connection string.





Click Execute.



 



image



 



Select the server node on the left, right-click it and click Refresh. The News database now appears in the combo box.



 



Step 4 Create table



 



 



Click File->Open and open CreateTableEng.sql. Choose the News database in the combo box at the upper left, then click Execute.



 



image



exec sp_droptable 'news';
Create table News
(
Title nvarchar(max) Tokenized Analyzer 'EnglishAnalyzer' NOT NULL Default '',
Content nvarchar(max) Tokenized Analyzer 'EnglishAnalyzer' NOT NULL Default '',
Time Date Untokenized NOT NULL Default '1990-01-01',
Url nvarchar(max) NULL
);





The first line drops the News table; if the table does not exist, the command does nothing. The second statement creates the table. The CREATE TABLE syntax closely resembles T-SQL. You must specify an analyzer for each full-text (Tokenized) field. Here it is EnglishAnalyzer, which is installed with Hubble.net by default. If you want to use another analyzer, you can implement the analyzer interface and install it in Hubble.net.




Step 5 Import test data for news



 



Click File->Batch Import and choose EnglishNews.sql.

You can download EnglishNews.sql from:





http://hubbledotnet.codeplex.com/Release/ProjectReleases.aspx?ReleaseId=36585



 



image



 



Click Import to batch import the news data.



 



After the import finishes, execute the following SQL to optimize the index:

exec SP_OptimizeTable 'News'



image



 



Test with the following SQL:

select top 10 * from news where content match 'hello' order by score desc



 



image















 



Web configuration



<connectionStrings>
<add
name="News"
connectionString="Data Source=127.0.0.1;Initial Catalog=News;"
providerName="Hubble.SQLClient"
/>
</connectionStrings>



You can set the connection string for Hubble.net in this section.





Index.CacheTimeout in App_Code/Index.cs sets the data cache timeout for SQLClient. The default is 0.

-1: No data cache.

0: The data cache is enabled, but the timeout is 0. In this case, for each query, SQLClient asks the Hubble.net service whether the table has changed. If the table has not changed, it reads from the data cache; otherwise it reads from the server.

N: Cache for N seconds, then ask the Hubble.net service whether the table has changed once the timeout expires.
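
The three settings can be illustrated with a small Python model. This is only a sketch of the behaviour described above, not the SQLClient code: the class DataCache, the fetch and table_version callbacks, and the use of a version counter to detect table changes are all assumptions made for the illustration.

```python
import time


class DataCache:
    """Toy model of the -1 / 0 / N timeout modes (hypothetical, not SQLClient)."""

    def __init__(self, timeout, fetch, table_version, clock=time.monotonic):
        self.timeout = timeout              # -1, 0, or N seconds
        self.fetch = fetch                  # assumed: sql -> rows read from the server
        self.table_version = table_version  # assumed: () -> change counter of the table
        self.clock = clock
        self.entries = {}                   # sql -> (rows, version, time last validated)

    def query(self, sql):
        if self.timeout < 0:
            return self.fetch(sql)          # -1: no cache, always read from the server

        now = self.clock()
        current = None
        entry = self.entries.get(sql)
        if entry is not None:
            rows, version, checked_at = entry
            # N > 0: within the timeout, answer from the cache without asking.
            if self.timeout > 0 and now - checked_at < self.timeout:
                return rows
            # 0, or timeout expired: ask whether the table has changed.
            current = self.table_version()
            if current == version:
                self.entries[sql] = (rows, version, now)
                return rows

        # Cache miss or table changed: read from the server and re-cache.
        if current is None:
            current = self.table_version()
        rows = self.fetch(sql)
        self.entries[sql] = (rows, current, now)
        return rows
```

With a timeout of 0 every query costs one change check but never returns stale rows; with N the check is skipped for up to N seconds, so results may lag behind the table by that long.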



Installation of Hubble.net

· Run the installation file


Choose one of them:
x86/setup.exe
x64/setup.exe
IA/setup.exe

· Welcome

image
Click Next

· Register

image
Registration is used only for statistics.
You can register at the following link:
http://www.hubbledotnet.com/key.aspx


After filling out all of the information, click the Submit button. Your installation key will be sent to the email address you entered. If you have not received it, please check your spam folder.
If it is not in the spam folder either, you can register again and the website will send you the key again.
If you have difficulty accessing this page, you can send the following information to Hubble.net@gmail.com and we will help you register:
The email address you used to register
Your country
Your name (a nickname is fine)

Enter the email address and key, then click Next.

· Choose the installation folder

image
Click Next to finish the installation.

Thursday, December 31, 2009

Introduction of Hubble.net

 

Homepage at codeplex

 

Abstract

Hubble.net is a free, open-source full-text search database component based on the .NET Framework, released under the Apache 2.0 license. Hubble.net provides a SQL-based full-text search interface, so users who already know SQL can quickly learn to use it. Hubble.net supports full-text indexing and querying, multi-field searching and sorting, grouping statistics, distinct, classification, clustering, multi-table queries, and a series of other full-text search and data mining features. It provides a database adapter interface, so it can integrate with various databases to add data mining and full-text search features. Hubble.net is designed with concurrency control so that inserts, deletes, updates and queries can run on multiple threads without conflict. Hubble.net also provides cache and memory management designed to help users maximize query efficiency. In the next few years, I think Hubble.net will be the most popular full-text search component in the .NET development environment.

 

 

Physical view

 

image

 

 

Hubble.net integrates full-text search with relational databases, so full-text searches against the database can be run through SQL. The Hubble.net component itself is responsible for the full-text inverted index, which is stored in the directory specified for the table defined in Hubble.net. The relational database is responsible for data storage.

 

 

Logical view

 

image

 

Hubble.net has the concepts of databases and tables, like a relational database. The databases and tables of Hubble.net are mappings to those of the relational database; Hubble.net holds no database or table entities of its own. When a full-text search query is sent to Hubble.net via SQL, Hubble.net automatically links to the corresponding entities in the relational database. From the user's point of view, Hubble.net looks like a database.

Hubble.net is responsible for building the inverted index for text fields and the key-value index for Untokenized fields. The relational database is responsible for the B+ tree indexes. If a query does not include a full-text search field, it is forwarded directly to the database, which uses its own indexes to search.
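
To make the two index types concrete, here is a minimal Python sketch of an inverted index for a tokenized text field plus a key-value index for an untokenized field. It is a conceptual toy, not Hubble.net's implementation: the ToyIndex class, its simple lowercase tokenizer, and the rule that a match must contain every query term are all assumptions chosen for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Very rough stand-in for an analyzer: lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ToyIndex:
    def __init__(self):
        # Inverted index for the tokenized field: term -> set of document ids.
        self.postings = defaultdict(set)
        # Key-value index for the untokenized field: document id -> value.
        self.by_date = {}

    def add(self, doc_id, content, date):
        for term in tokenize(content):
            self.postings[term].add(doc_id)
        self.by_date[doc_id] = date

    def match(self, text):
        """Return ids of documents containing every term in `text` (assumed AND)."""
        terms = tokenize(text)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result

    def newest_first(self, doc_ids):
        """Sort matched ids by the untokenized date value, newest first."""
        return sorted(doc_ids, key=self.by_date.get, reverse=True)
```

The inverted index answers "which documents contain this word" without scanning the text, while the key-value index lets matched ids be sorted or filtered on a plain field without going back to the relational database.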

 

 

Three-level cache

 

 

image

 

Hubble.net is designed with three levels of cache.

Index cache: the index-level cache holds the inverted index and the key-value index. This cache is managed automatically and cannot be turned off. The index-level cache is synchronized automatically when data is deleted, updated or inserted.

Query cache: the query-level cache holds query results. The Hubble.net system service caches the document ids of each query result. When the same query is executed again, the query cache is used. When the table changes, the query-level cache expires if the timeout is zero and must be rebuilt.

Data cache: the data-level cache runs in SQLClient. SQLClient caches the data and next time reads it directly from the data-level cache rather than fetching it from the Hubble.net system service. When the table changes, the data-level cache expires if the timeout is zero and must be rebuilt.

 

Memory management

Hubble.net runs as a system service, so unlike Lucene it does not share memory with the application. Hubble.net has its own memory management mechanism, and the user can set the maximum amount of memory it may use. Once Hubble.net's memory usage exceeds this amount, it automatically starts a memory cleaning process that clears some of the less frequently used cache entries from memory to free up space. Users can view and manage memory through the SP_CONFIGURE stored procedure.
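
The cleaning idea can be sketched in Python as a cache with a byte limit that evicts its least frequently used entries once the limit is exceeded. This is an illustration only: the BoundedCache class, the caller-supplied sizes, and the exact eviction order are assumptions, and Hubble.net's real cleaning process may work differently.

```python
class BoundedCache:
    """Toy cache with a memory limit and least-frequently-used cleanup."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self.items = {}  # key -> [value, size in bytes, hit count]

    def get(self, key):
        item = self.items.get(key)
        if item is None:
            return None
        item[2] += 1  # record the hit so frequently used entries survive cleanup
        return item[0]

    def put(self, key, value, size):
        if key in self.items:
            self.used -= self.items[key][1]
        self.items[key] = [value, size, 0]
        self.used += size
        if self.used > self.max_bytes:
            self.clean(protect=key)

    def clean(self, protect=None):
        """Evict entries with the fewest hits until usage is within the limit."""
        for key in sorted(self.items, key=lambda k: self.items[k][2]):
            if self.used <= self.max_bytes:
                break
            if key == protect:
                continue  # do not evict the entry that was just stored
            self.used -= self.items.pop(key)[1]
```

For example, with a 100-byte limit and three 40-byte entries, storing the third entry triggers a cleanup that drops whichever of the first two has been read less often.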