Topics

Topic: 3D file formats

•  3ds (3D Studio files)
•  ac (AC3D files)
•  dae (Sony COLLADA files)
•  dwg (Autodesk DWG files)
•  dxf (Autodesk DXF files)
•  igs (IGES files)
•  iv (SGI Inventor files)
•  lwo (Lightwave object files)
•  md2 (Quake MD2 files)
•  md3 (Quake MD3 files)
•  obj (Wavefront Object files)
•  off (Object File Format - ref. Geomview at U. Minnesota Science & Technology Center)
•  ply (Ply format files - ref. Cyberware, Stanford, GaTech, P. Bourke)
•  pov (POV ray tracer files)
•  stl (Stereolithography files) (suffix '.stlb' is sometimes used for binary STL files)
•  stp (STEP files)
•  vrml or wrl (VRML1 files) (a subset of the SGI Inventor file format)
•  vtk (Visualization Toolkit files)
•  wrl (VRML2 = VRML97 files) (common file suffixes: '.wrl', '.wrl.gz', '.wrz')


Topic: 3D file MIME types

Published MIME type indicators for 3D files:
◦ .3ds  - ? 
◦ .blend - ? 
◦ .dwf  - drawing/x-dwf  -OR-  model/vnd.dwf 
◦ .dwg  - application/acad  -OR-  image/x-dwg  -OR-  image/vnd.dwg 
◦ .dxf  - application/dxf  -OR-  image/x-dwg  -OR-  image/vnd.dwg 
◦ .igs  - application/iges  -OR-  model/iges 
◦ .iv   - application/x-inventor 
◦ .md2  - ? 
◦ .md3  - ? 
◦ .off  - ? 
◦ .obj  - ? 
◦ .ply  - ? 
◦ .pov  - model/x-pov 
◦ .stl  - application/sla 
◦ .stp  - application/step 
◦ .vrml - application/x-vrml  -OR-  model/vrml  -OR-  x-world/x-vrml 
◦ .vtk  - ? 
◦ .wrl  - application/x-world  -OR-  model/vrml  -OR-  x-world/x-vrml 
◦ .wrz  - x-world/x-vrml  -OR-  model/vrml 

Note that even though there do not seem to be sanctioned MIME type indicators for '.obj' and '.ply' files,
my attempt at using 'application/wobj' and 'application/ply' proved successful.


Topic: setting 3D MIME types in web.config

<configuration>
  <system.webServer>
    <staticContent>
      <remove fileExtension=".obj" />
      <mimeMap fileExtension=".obj" mimeType="application/wobj" />
      <remove fileExtension=".ply" />
      <mimeMap fileExtension=".ply" mimeType="application/ply" />
      <remove fileExtension=".stl" />
      <mimeMap fileExtension=".stl" mimeType="application/sla" />
      <remove fileExtension=".vrml" />
      <mimeMap fileExtension=".vrml" mimeType="application/x-vrml" />
      <remove fileExtension=".vtk" />
      <mimeMap fileExtension=".vtk" mimeType="application/vtk" />
    </staticContent>
    ...
  </system.webServer>
</configuration>


Topic: threejs_sample



Topic: button styles

                  

Button     class=             Description
Default    btn                Standard gray button with gradient
Primary    btn btn-primary    Provides extra visual weight and identifies the primary action in a set of buttons
Info       btn btn-info       Used as an alternative to the default styles
Success    btn btn-success    Indicates a successful or positive action
Warning    btn btn-warning    Indicates caution should be taken with this action
Danger     btn btn-danger     Indicates a dangerous or potentially negative action
Inverse    btn btn-inverse    Alternate dark gray button, not tied to a semantic action or use
Link       btn btn-link       Deemphasize a button by making it look like a link while maintaining button behavior
        


Topic: AssemblyInfo.cs

GlobalAssemblyInfo.cs:

 [assembly: AssemblyProduct("Your Product Name")]

 [assembly: AssemblyCompany("Your Company")]
 [assembly: AssemblyCopyright("Copyright © 2008 ...")]
 [assembly: AssemblyTrademark("Your Trademark - if applicable")]

 #if DEBUG
 [assembly: AssemblyConfiguration("Debug")]
 #else
 [assembly: AssemblyConfiguration("Release")]
 #endif

 [assembly: AssemblyVersion("This is set by build process")]
 [assembly: AssemblyFileVersion("This is set by build process")]


Local AssemblyInfo.cs:

 [assembly: AssemblyTitle("Your assembly title")]
 [assembly: AssemblyDescription("Your assembly description")]
 [assembly: AssemblyCulture("The culture - if not neutral")]

 [assembly: ComVisible(true/false)]

 // unique id per assembly
 [assembly: Guid("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx")]

 // Automatically increased assembly version
 [assembly: AssemblyVersion("1.0.*")]
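
To share GlobalAssemblyInfo.cs across projects, it is usually added to each project as a link rather than copied. A minimal sketch of the csproj entry (the relative path is an assumption):

 <ItemGroup>
   <Compile Include="..\GlobalAssemblyInfo.cs">
     <Link>Properties\GlobalAssemblyInfo.cs</Link>
   </Compile>
 </ItemGroup>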


Topic: async_await

When using async and await the compiler generates a state machine in the background.

Here's an example with which I hope I can explain some of the high-level details that are going on:

public async Task MyMethodAsync()
{
    Task<int> longRunningTask = LongRunningOperationAsync();
    // independent work which doesn't need the result of LongRunningOperationAsync can be done here

    //and now we call await on the task 
    int result = await longRunningTask;
    //use the result 
    Console.WriteLine(result);
}

public async Task<int> LongRunningOperationAsync() // assume we return an int from this long running operation 
{
    await Task.Delay(1000); // 1 second delay
    return 1;
}

OK, so what happens here:

    Task<int> longRunningTask = LongRunningOperationAsync(); starts executing LongRunningOperation

    Independent work is done (let's assume on the main thread, Thread ID = 1) until await longRunningTask is reached.

    Now, if the longRunningTask hasn't finished and it is still running, MyMethodAsync() will return to its calling method, thus the main thread doesn't get blocked. When the longRunningTask is done then a thread from the ThreadPool (can be any thread) will return to MyMethodAsync() in its previous context and continue execution (in this case printing the result to the console).

A second case would be that the longRunningTask has already finished its execution and the result is available. When reaching the await longRunningTask we already have the result, so the code will continue executing on the very same thread (in this case printing the result to the console). Of course this is not the case for the above example, where there's a Task.Delay(1000) involved.
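
A minimal console sketch (not part of the original example; assumes C# 7.1+ for async Main) that logs thread IDs so both cases above become observable:

using System;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        Console.WriteLine($"Before await: thread {Thread.CurrentThread.ManagedThreadId}");
        await Task.Delay(1000); // task not finished yet -> control returns to the caller here
        // A console app has no synchronization context, so this continuation
        // typically runs on a ThreadPool thread (the first case above).
        Console.WriteLine($"After await: thread {Thread.CurrentThread.ManagedThreadId}");
    }
}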

=================

The following Windows Forms example illustrates the use of await in an async method, WaitAsynchronouslyAsync. Contrast the behavior of that method with the behavior of WaitSynchronously. Without an await operator applied to a task, WaitSynchronously runs synchronously despite the use of the async modifier in its definition and a call to Thread.Sleep in its body.

private async void button1_Click(object sender, EventArgs e)
{
    // Call the method that runs asynchronously.
    string result = await WaitAsynchronouslyAsync();

    // Call the method that runs synchronously.
    //string result = await WaitSynchronously ();

    // Display the result.
    textBox1.Text += result;
}

// The following method runs asynchronously. The UI thread is not
// blocked during the delay. You can move or resize the Form1 window 
// while Task.Delay is running.
public async Task<string> WaitAsynchronouslyAsync()
{
    await Task.Delay(10000);
    return "Finished";
}

// The following method runs synchronously, despite the use of async.
// You cannot move or resize the Form1 window while Thread.Sleep
// is running because the UI thread is blocked.
public async Task<string> WaitSynchronously()
{
    // Add a using directive for System.Threading.
    Thread.Sleep(10000);
    return "Finished";
}


Topic: HashSet
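
(The note body is empty; as a minimal placeholder sketch: HashSet<T> is an unordered collection of unique items with, on average, O(1) Add/Contains.)

using System;
using System.Collections.Generic;

var set = new HashSet<int> { 1, 2, 3 };

Console.WriteLine(set.Add(2));      // False - duplicates are rejected
Console.WriteLine(set.Contains(3)); // True  - average O(1) lookup
set.UnionWith(new[] { 3, 4, 5 });   // set operations: UnionWith, IntersectWith, ExceptWith
Console.WriteLine(set.Count);       // 5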


Topic: Multidimensional Arrays and Jagged Arrays

# Multidimensional Arrays

string[,] Tablero = new string[,]
{
    {"1.1","1.2", "1.3"},
    {"2.1","2.2", "2.3"},
    {"3.1", "3.2", "3.3"}
};

int[, ,] array1 = new int[4, 2, 3];



# Jagged Arrays

string[][] Tablero = new string[][]
{
    new string[] {"1.1","1.2", "1.3"},
    new string[] {"2.1","2.2", "2.3"},
    new string[] {"3.1", "3.2", "3.3"}
};

A jagged array is an array whose elements are arrays. The elements of a jagged array can be of different dimensions and sizes. A jagged array is sometimes called an "array of arrays." 
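
A short sketch (not in the original notes) showing rows of different lengths, which a rectangular [,] array cannot express:

int[][] jagged = new int[3][];
jagged[0] = new int[] { 1 };
jagged[1] = new int[] { 1, 2, 3 };
jagged[2] = new int[] { 1, 2 };
Console.WriteLine(jagged[1].Length); // 3 - each row carries its own length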


Topic: Multiline in Razor

http://stackoverflow.com/questions/8794906/textboxfor-mulitline

You could use a TextAreaFor helper:

@Html.TextAreaFor(
    model => model.Headline, 
    new { style = "width: 400px; height: 200px;" }
)

but a much better solution is to decorate your Headline view model property with the [DataType] attribute specifying that you want it to render as a <textarea>:

public class MyViewModel
{
    [DataType(DataType.MultilineText)]
    public string Headline { get; set; }

    ...
}

and then use the EditorFor helper:

<div class="headline">
    @Html.EditorFor(model => model.Headline)
</div>

and finally in your CSS file specify its styling:

div.headline {
    width: 400px;
    height: 200px;
}

Now you have a proper separation of concerns.


Topic: Regular expressions quick reference



Topic: type alias

C# Type    .NET Framework Type
bool       System.Boolean
byte       System.Byte
sbyte      System.SByte
char       System.Char
decimal    System.Decimal
double     System.Double
float      System.Single
int        System.Int32
uint       System.UInt32
long       System.Int64
ulong      System.UInt64
object     System.Object
short      System.Int16
ushort     System.UInt16
string     System.String
 


Topic: caffe layers



Topic: Resources

Caffe on Windows
https://github.com/Microsoft/caffe

Caffe on Windows Tutorials
http://www.cnblogs.com/yixuan-xu/p/5858595.html
http://www.cnblogs.com/yixuan-xu/p/5862657.html

Pre-trained Models from BVLC
http://dl.caffe.berkeleyvision.org/

Model Zoo
https://github.com/BVLC/caffe/wiki/Model-Zoo

CNN
http://www.cnblogs.com/52machinelearning/p/5821591.html

RCNN - Regions with CNN features
http://www.cnblogs.com/louyihang-loves-baiyan/p/4839869.html

Pre-trained Faster-RCNN model
https://people.eecs.berkeley.edu/~rbg/faster-rcnn-data/

FCN - Fully Convolutional Networks for Semantic Segmentation
https://github.com/shelhamer/fcn.berkeleyvision.org

Use Cifar10 for two-label classification
http://stackoverflow.com/questions/31972448/how-to-understand-the-cifar10-prediction-output

Use FCN on caffe python
http://stackoverflow.com/questions/40174122/how-to-test-fcnvoc-fcn8s-on-caffe-python


Topic: SGD and GD

In both gradient descent (GD) and stochastic gradient descent (SGD), you update a set of parameters in an iterative manner to minimize an error function.

While in GD you have to run through ALL the samples in your training set to do a single parameter update in a particular iteration, in SGD you use ONLY ONE training sample, or a SUBSET of training samples, from your training set to do the update in a particular iteration. If you use a SUBSET, it is called Minibatch Stochastic Gradient Descent.
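
Written out (standard notation, not from the original note), with parameters w, learning rate lr, and per-sample loss L_i over N training samples:

    GD:  w := w - lr * (1/N) * sum over all i of grad L_i(w)    (one update touches all N samples)
    SGD: w := w - lr * grad L_i(w)                              (one update touches a single sample i)

Minibatch SGD replaces the single sample with an average over a small batch (e.g. 32 or 256 samples).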

Thus, if the number of training samples is large, in fact very large, then using gradient descent may take too long, because in every iteration, when you are updating the values of the parameters, you are running through the complete training set. On the other hand, using SGD is faster because you use only one training sample, and it starts improving itself right away from the first sample.

SGD often converges much faster than GD, but the error function is not as well minimized as in the case of GD. In most cases, though, the close approximation of the parameter values that you get with SGD is enough, because they reach near-optimal values and keep oscillating there.


Topic: solver types



Topic: solver.prototxt file

The solver.prototxt is a configuration file used to tell caffe how you want the network trained.

Parameters

> base_lr

This parameter indicates the base (beginning) learning rate of the network. The value is a real number (floating point).

> lr_policy

This parameter indicates how the learning rate should change over time. This value is a quoted string.

Options include:

    "step" - drop the learning rate by a factor of gamma every stepsize iterations.
    "multistep" - drop the learning rate by a factor of gamma at each specified stepvalue.
    "fixed" - the learning rate does not change.
    "exp" - exponential decay: base_lr * gamma^iteration.
    "poly" - polynomial decay: base_lr * (1 - iter/max_iter)^power.
    "sigmoid" - sigmoid decay: base_lr * (1 / (1 + exp(-gamma * (iter - stepsize)))).

> gamma

This parameter indicates how much the learning rate should change every time we reach the next "step." The value is a real number, and can be thought of as multiplying the current learning rate by said number to gain a new learning rate.

> stepsize

This parameter indicates how often (at some iteration count) we should move onto the next "step" of training. This value is a positive integer.

> stepvalue

This parameter indicates one of potentially many iteration counts at which we should move onto the next "step" of training. This value is a positive integer. There is often more than one of these parameters present, each one indicating the next step iteration.

> max_iter

This parameter indicates when the network should stop training. The value is an integer indicating which iteration should be the last.

> momentum

This parameter indicates how much of the previous weight will be retained in the new calculation. This value is a real fraction.

> weight_decay

This parameter indicates the factor of (regularization) penalization of large weights. This value is often a real fraction.

> solver_mode

This parameter indicates which mode will be used in solving the network.

Options include:

    CPU
    GPU

> snapshot

This parameter indicates how often caffe should output a model and solverstate. This value is a positive integer.

> snapshot_prefix

This parameter indicates how a snapshot output's model and solverstate's name should be prefixed. This value is a double quoted string.

> net

This parameter indicates the location of the network to be trained (path to prototxt). This value is a double quoted string.

> test_iter

This parameter indicates how many test iterations should occur per test_interval. This value is a positive integer.

> test_interval

This parameter indicates how often the test phase of the network will be executed.

> display

This parameter indicates how often caffe should output results to the screen. This value is a positive integer and specifies an iteration count.

> type

This parameter indicates the back propagation algorithm used to train the network. This value is a quoted string.

Options include:

    Stochastic Gradient Descent "SGD"
    AdaDelta "AdaDelta"
    Adaptive Gradient "AdaGrad"
    Adam "Adam"
    Nesterov’s Accelerated Gradient "Nesterov"
    RMSprop "RMSProp"
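
Putting the parameters above together, a minimal illustrative solver.prototxt might look like the following (the net path and all values are assumptions, not taken from a real project):

net: "models/my_net/train_val.prototxt"
test_iter: 100
test_interval: 500
base_lr: 0.01
lr_policy: "step"
gamma: 0.1
stepsize: 10000
display: 100
max_iter: 45000
momentum: 0.9
weight_decay: 0.0005
snapshot: 5000
snapshot_prefix: "snapshots/my_net"
solver_mode: GPU
type: "SGD"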


Topic: weight_decay decay_mult

=== weight_decay ===

The weight_decay meta parameter governs the regularization term of the neural net.

During training a regularization term is added to the network's loss to compute the backprop gradient. The weight_decay value determines how dominant this regularization term will be in the gradient computation.

As a rule of thumb, the more training examples you have, the weaker this term should be. The more parameters you have (i.e., deeper net, larger filters, larger InnerProduct layers etc.) the higher this term should be.

Caffe also allows you to choose between L2 regularization (default, sum of weight squares) and L1 regularization (sum of weight absolute values), by setting

regularization_type: "L1"

However, since in most cases weights are small numbers (i.e., -1<w<1), the L2 norm of the weights is significantly smaller than their L1 norm. Thus, if you choose to use regularization_type: "L1" you might need to tune weight_decay to a significantly smaller value.

(There is also an L0 norm, which corresponds to the total number of non-zero-valued weights. L0 and L1 tend to yield a sparse solution, i.e. only keep the principal features.)

While learning rate may (and usually does) change during training, the regularization weight is fixed throughout.


=== decay_mult ===

In the solver file, we can set a global regularization loss using the weight_decay and regularization_type options.

In many cases we want different weight decay rates for different layers. This can be done by setting the decay_mult option for each layer in the network definition file, where decay_mult is the multiplier on the global weight decay rate, so the actual weight decay rate applied for one layer is decay_mult*weight_decay.

For example, the following defines a convolutional layer with NO weight decay regardless of the options in the solver file.

layer {
  name: "Convolution1"
  type: "Convolution"
  bottom: "data"
  top: "Convolution1"
  param {
    decay_mult: 0
  }
  convolution_param {
    num_output: 32
    pad: 0
    kernel_size: 3
    stride: 1
    weight_filler {
      type: "xavier"
    }
  }
}



Topic: AttachDbWithOnlyMdf

sp_attach_single_file_db @dbname='DOCTOR_20130922',@physname='D:\CDSS-root\ShanXi\DB\MIAS_DB_20130922.mdf'
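
sp_attach_single_file_db is deprecated in newer SQL Server versions; an equivalent sketch using the same database name and path is:

CREATE DATABASE DOCTOR_20130922
    ON (FILENAME = 'D:\CDSS-root\ShanXi\DB\MIAS_DB_20130922.mdf')
    FOR ATTACH_REBUILD_LOG;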


Topic: Clean Log File

SQL Server requires a transaction log in order to function.

That said there are two modes of operation for the transaction log:

Simple
Full

In Full mode the transaction log keeps growing until you back up the database. In Simple mode, space in the transaction log is 'recycled' at every checkpoint.

Very few people have a need to run their databases in the Full recovery model. The only point in using the Full model is if you want to back up the database multiple times per day, and backing up the whole database takes too long - so you just back up the transaction log.

The transaction log keeps growing all day, and you keep backing up just the log. That night you do your full backup, and SQL Server then truncates the transaction log and begins to reuse the space allocated in the transaction log file.

If you only ever do full database backups, you don't want the Full recovery mode.

ALTER DATABASE testdb SET RECOVERY SIMPLE;


Use this to free up the log file:

USE yourdb;
GO
CHECKPOINT;
GO
CHECKPOINT; -- run twice to ensure file wrap-around
GO
DBCC SHRINKFILE(yourdb_log, 200); -- MB
GO
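
To check which recovery model each database currently uses before switching (a small helper query, not part of the original note):

SELECT name, recovery_model_desc FROM sys.databases;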


Topic: data import using sqlcmd

> sqlcmd -S .\SQLExpress -i C:\Exchange\ftp\publish\Script.sql


Topic: Database Connection Pooling

Database connection pooling is a method used to keep database connections open so they can be reused by others.

Typically, opening a database connection is an expensive operation, so pooling keeps the connections active so that, when a connection is later requested, one of the active ones is used in preference to opening another one.

In its simplest form, it's just a similar API call to the real open-connection API, which first checks the pool for a suitable connection. If one is available, that's given to the client; otherwise a new one is created.

Similarly, there's a close API call which doesn't actually call the real close-connection, rather it puts the connection into the pool for later use.

That's a pretty simplistic explanation. Real implementations may be able to handle connections to multiple servers, may pre-allocate some baseline of connections so some are ready immediately, may actually close old connections when the usage pattern quietens down, and so on.

So, something like the following:

  +---------+
  |         |
  | Clients |
+---------+ |
|         |-+        +------+          +----------+
| Clients | ===#===> | Open | =======> | RealOpen |
|         |    |     +------+          +----------+
+---------+    |         ^
               |         |
               |     /------\
               |     | Pool |
               |     \------/
               |         ^
               |         |
               |     +-------+         +-----------+
               #===> | Close | ======> | RealClose |
                     +-------+         +-----------+
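
In ADO.NET this is transparent: SqlConnection pools automatically per connection string, and pool behavior can be tuned with connection-string keywords. A sketch (server and database names are placeholders):

using System.Data.SqlClient;

var connStr = "Server=.\\SQLExpress;Database=MyDb;Integrated Security=true;" +
              "Pooling=true;Min Pool Size=5;Max Pool Size=100;";

using (var conn = new SqlConnection(connStr))
{
    conn.Open();   // takes a connection from the pool (or opens a real one)
}                  // Dispose/Close returns the connection to the pool, not a real close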


Topic: ManipulateSQLServerWithoutManagementStudio

Open "SQL Server Configuration Manager"

Enable "Named Pipes" protocol

Create an alias:

      Alias Name: oo
      Protocol: Named Pipes
      Server: .\SQLExpress

cmd -> sqlcmd -S oo


Topic: OracleImport


Topic: RefreshSSMSIntellisense

After modifying a table structure, the query editor doesn't recognize the change.
In this case, press CTRL+SHIFT+R to refresh IntelliSense.


Topic: SQL Server Indexes

                 

  

Thursday, January 01, 2004

Relational databases like SQL Server use indexes to find data quickly when a query is processed. Creating and removing indexes from a database schema will rarely result in changes to an application's code; indexes operate 'behind the scenes' in support of the database engine. However, creating the proper index can drastically increase the performance of an application.

The SQL Server engine uses an index in much the same way a reader uses a book index. For example, one way to find all references to INSERT statements in a SQL book would be to begin on page one and scan each page of the book. We could mark each time we find the word INSERT until we reach the end of the book. This approach is pretty time consuming and laborious. Alternately, we can also use the index in the back of the book to find a page number for each occurrence of the INSERT statements. This approach produces the same results as above, but with tremendous savings in time.  

When SQL Server has no index to use for searching, the result is similar to the reader who looks at every page in a book to find a word: the SQL engine needs to visit every row in a table. In database terminology we call this behavior a table scan, or just scan.

A table scan is not always a problem, and is sometimes unavoidable. However, as a table grows to thousands of rows and then millions of rows and beyond, scans become correspondingly slower and more expensive. 

Consider the following query on the Products table of the Northwind database. This query retrieves products in a specific price range. 


SELECT ProductID, ProductName, UnitPrice FROM Products WHERE (UnitPrice > 12.5) AND (UnitPrice < 14)  


There is currently no index on the Product table to help this query, so the database engine performs a scan and examines each record to see if UnitPrice falls between 12.5 and 14. In the diagram below, the database search touches a total of 77 records to find just three matches. 

Now imagine if we created an index, just like a book index, on the data in the UnitPrice column. Each index entry would contain a copy of the UnitPrice value for a row, and a reference (just like a page number) to the row where the value originated. SQL will sort these index entries into ascending order. The index will allow the database to quickly narrow in on the three rows to satisfy the query, and avoid scanning every row in the table.

Create An Index

Having a data connection in the Server Explorer view of Visual Studio.NET allows us to easily create new indexes:  
Navigate to the Products table of the Northwind database.  
Right click the table and select Design Table from the context menu.  
With the design screen in focus, click the Indexes/Keys item on the View menu of the IDE.  

This should bring you to the following tabbed dialog box. 

The dialog is currently displaying an existing index on the Products table: the PK_Products index. We will see later in this chapter how primary key fields are automatically indexed to enforce uniqueness in the key values.  
In the above dialog click on the New button, and in the Index name text box, replace the existing entry with IDX_UnitPrice.  
Beneath the text box is a control where we set the columns to index. Pull down the entry with ProductID and select the UnitPrice column instead. 
Leave all of the other options with default settings. 
Close the dialog and the table design view, making sure to save all of the changes when prompted to do so. The IDE will then issue the commands to create the new index. 

We can create the same index using the following SQL. The command specifies the name of the index (IDX_UnitPrice), the table name (Products), and the column to index (UnitPrice). 


 CREATE INDEX [IDX_UnitPrice] ON Products (UnitPrice)  


To verify that the index is created, use the following stored procedure to see a list of all indexes on the Products table: 


 EXEC sp_helpindex Products  

 How It Works 

The database takes the columns specified in a CREATE INDEX command and sorts the values into a special data structure known as a B-tree. A B-tree structure supports fast searches with a minimum amount of disk reads, allowing the database engine to quickly find the starting and stopping points for the query we are using. 

Conceptually, we may think of an index as shown in the diagram below. On the left, each index entry contains the index key (UnitPrice). Each entry also includes a reference (which points) to the table rows which share that particular value and from which we can retrieve the required information. 

Much like the index in the back of a book helps us to find keywords quickly, so the database is able to quickly narrow the number of records it must examine to a minimum by using the sorted list of UnitPrice values stored in the index. We have avoided a table scan to fetch the query results. Given this sketch of how indexes work, let's examine some of the scenarios where indexes offer a benefit.

Taking Advantage of Indexes

The database engine can use indexes to boost performance in a number of different queries. Sometimes these performance improvements are dramatic. An important feature of SQL Server 2000 is a component known as the query optimizer. The query optimizer's job is to find the fastest and least resource intensive means of executing incoming queries. An important part of this job is selecting the best index or indexes to perform the task. In the following sections we will examine the types of queries with the best chance of benefiting from an index.

Searching For Records

The most obvious use for an index is in finding a record or set of records matching a WHERE clause. Indexes can aid queries looking for values inside of a range (as we demonstrated earlier), as well as queries looking for a specific value. By way of example, the following queries can all benefit from an index on UnitPrice:  


DELETE FROM Products WHERE UnitPrice = 1

UPDATE Products SET Discontinued = 1 WHERE UnitPrice > 15

SELECT * FROM Products WHERE UnitPrice BETWEEN 14 AND 16


Indexes work just as well when searching for a record in DELETE and UPDATE commands as they do for SELECT statements.

Sorting Records

When we ask for a sorted dataset, the database will try to find an index and avoid sorting the results during execution of the query. We control sorting of a dataset by specifying a field, or fields, in an ORDER BY clause, with the sort order as ASC (ascending) or DESC (descending). For example, the following query returns all products sorted by price: 


 SELECT * FROM Products ORDER BY UnitPrice ASC  

With no index, the database will scan the Products table and sort the rows to process the query. However, the index we created on UnitPrice (IDX_UnitPrice) earlier provides the database with a presorted list of prices. The database can simply scan the index from the first entry to the last entry and retrieve the rows in sorted order. 


The same index works equally well with the following query, simply by scanning the index in reverse. 


SELECT * FROM Products ORDER BY UnitPrice DESC  

 Grouping Records 

We can use a GROUP BY clause to group records and aggregate values, for example, counting the number of orders placed by a customer. To process a query with a GROUP BY clause, the database will often sort the results on the columns included in the GROUP BY. The following query counts the number of products at each price by grouping together records with the same UnitPrice value. 


SELECT Count(*), UnitPrice FROM Products GROUP BY UnitPrice  


The database can use the IDX_UnitPrice index to retrieve the prices in order. Since matching prices appear in consecutive index entries, the database is able to count the number of products at each price quickly. Indexing a field used in a GROUP BY clause can often speed up a query.

Maintaining a Unique Column

Columns requiring unique values (such as primary key columns) must have a unique index applied. There are several methods available to create a unique index. Marking a column as a primary key will automatically create a unique index on the column. We can also create a unique index by checking the Create UNIQUE checkbox in the dialog shown earlier. The screen shot of the dialog displayed the index used to enforce the primary key of the Products table. In this case, the Create UNIQUE checkbox is disabled, since an index to enforce a primary key must be a unique index. However, creating new indexes not used to enforce primary keys will allow us to select the Create UNIQUE checkbox. We can also create a unique index using SQL with the following command:  


CREATE UNIQUE INDEX IDX_ProductName On Products (ProductName)  


The above SQL command will not allow any duplicate values in the ProductName column, and an index is the best tool for the database to use to enforce this rule. Each time an application adds or modifies a row in the table, the database needs to search all existing records to ensure none of the values in the new data duplicate existing values. Indexes, as we should know by now, will improve this search time.

Index Drawbacks

There are tradeoffs to almost any feature in computer programming, and indexes are no exception. While indexes provide a substantial performance benefit to searches, there is also a downside to indexing. Let's talk about some of those drawbacks now.

Indexes and Disk Space

Indexes are stored on the disk, and the amount of space required will depend on the size of the table, and the number and types of columns used in the index. Disk space is generally cheap enough to trade for application performance, particularly when a database serves a large number of users. To see the space required for a table, use the sp_spaceused system stored procedure in a query window.

 EXEC sp_spaceused Orders  


Given a table name (Orders), the procedure will return the amount of space used by the data and all indexes associated with the table, like so:  


Name    rows  reserved  data    index_size  unused
------  ----  --------  ------  ----------  ------
Orders  830   504 KB    160 KB  320 KB      24 KB


According to the output above, the table data uses 160 kilobytes, while the table indexes use twice as much, or 320 kilobytes. The ratio of index size to table size can vary greatly, depending on the columns, data types, and number of indexes on a table.

Indexes and Data Modification

Another downside to using an index is the performance implication on data modification statements. Any time a query modifies the data in a table (INSERT, UPDATE, or DELETE), the database needs to update all of the indexes where data has changed. As we discussed earlier, indexing can help the database during data modification statements by allowing the database to quickly locate the records to modify, however, we now caveat the discussion with the understanding that providing too many indexes to update can actually hurt the performance of data modifications. This leads to a delicate balancing act when tuning the database for performance. 

In decision support systems and data warehouses, where information is stored for reporting purposes, data remains relatively static and report generating queries outnumber data modification queries. In these types of environments, heavy indexing is commonplace in order to optimize the reports generated. In contrast, a database used for transaction processing will see many records added and updated. These types of databases will use fewer indexes to allow for higher throughput on inserts and updates. 

Every application is unique, and finding the best indexes to use for a specific application usually requires some help from the optimization tools offered by many database vendors. SQL Server 2000 and Access include the Profiler and Index Tuning Wizard tools to help tweak performance. 

Now we have enough information to understand why indexes are useful and where indexes are best applied. It is time now to look at the different options available when creating an index and then address some common rules of thumb to use when planning the indexes for your database.

Clustered Indexes

Earlier in the article we made an analogy between a database index and the index of a book. A book index stores words in order with a reference to the page numbers where the word is located. This type of index for a database is a nonclustered index; only the index key and a reference are stored. In contrast, a common analogy for a clustered index is a phone book. A phone book still sorts entries into alphabetical order. The difference is, once we find a name in a phone book, we have immediate access to the rest of the data for the name, such as the phone number and address. 

For a clustered index, the database will sort the table's records according to the column (or columns) specified by the index. A clustered index contains all of the data for a table in the index, sorted by the index key, just like a phone book is sorted by name and contains all of the information for the person inline. The nonclustered indexes created earlier in the chapter contain only the index key and a reference to find the data, which is more like a book index. You can only create one clustered index on each table. 

In the diagram below we have a search using a clustered index on the UnitPrice column of the Products table. Compare this diagram to the previous diagram with a regular index on UnitPrice. Although we are only showing three columns from the Products table, all of the columns are present. Notice the rows are sorted into the order of the index; there is no reference to follow from the index back to the data.

A clustered index is the most important index you can apply to a table. If the database engine can use a clustered index during a query, the database does not need to follow references back to the rest of the data, as happens with a nonclustered index. The result is less work for the database, and consequently, better performance for a query using a clustered index.

To create a clustered index, simply select the Create As CLUSTERED checkbox in the dialog box we used at the beginning of the chapter. The SQL syntax for a clustered index simply adds a new keyword to the CREATE INDEX command, as shown below: 


 CREATE CLUSTERED INDEX IDX_SupplierID ON Products(SupplierID)  


Most of the tables in the Northwind database already have a clustered index defined on a table. Since we can only have one clustered index per table, and the Products table already has a clustered index (PK_Products) on the primary key (ProductId), the above command should generate the following error:  


Cannot create more than one clustered index on table 'Products'. Drop the existing clustered index 'PK_Products' before creating another.  


As a general rule of thumb, every table should have a clustered index. If you create only one index for a table, use a clustered index. Not only is a clustered index more efficient than other indexes for retrieval operations, a clustered index also helps the database efficiently manage the space required to store the table. In SQL Server, creating a primary key constraint will automatically create a clustered index (if none exists) using the primary key column as the index key.  

Sometimes it is better to use a unique nonclustered index on the primary key column, and place the clustered index on a column used by more queries. For example, if the majority of searches are for the price of a product instead of the primary key of a product, the clustered index could be more effective if used on the price field. A clustered index can also be a UNIQUE index.

A Disadvantage to Clustered Indexes

If we update a record and change the value of an indexed column in a clustered index, the database might need to move the entire row into a new position to keep the rows in sorted order. This behavior essentially turns an update query into a DELETE followed by an INSERT, with an obvious decrease in performance. A table's clustered index can often be found on the primary key or a foreign key column, because key values generally do not change once a record is inserted into the database.

Composite Indexes

A composite index is an index on two or more columns. Both clustered and nonclustered indexes can be composite indexes. Composite indexes are especially useful in two different circumstances. First, you can use a composite index to cover a query. Secondly, you can use a composite index to help match the search criteria of specific queries. We will go into more detail and give examples of these two areas in the following sections.

Covering Queries with an Index

Earlier in the article we discussed how an index, specifically a nonclustered index, contains only the key values and a reference to find the associated row of data. However, if the key value contains all of the information needed to process a query, the database never has to follow the reference and find the row; it can simply retrieve the information from the index and save processing time. This is always a benefit for clustered indexes. 

As an example, consider the index we created on the Products table for UnitPrice. The database copied the values from the UnitPrice column and sorted them into an index. If we execute the following query, the database can retrieve all of the information for the query from the index itself.  


SELECT UnitPrice FROM Products ORDER BY UnitPrice  


We call these types of queries covered queries, because all of the columns requested in the output are contained in the index itself. A clustered index, if selected for use by the query optimizer, always covers a query, since it contains all of the data in a table. 

For the following query, there are no covering indexes on the Products table. 


SELECT ProductName, UnitPrice FROM Products ORDER BY UnitPrice  


This is because although the database will use the index on UnitPrice to avoid sorting records, it will need to follow the reference in each index entry to find the associated row and retrieve the product name. By creating a composite index on two columns (ProductName and UnitPrice), we can cover this query with the new index.

Matching Complex Search Criteria

For another way to use composite indexes, let's take a look at the OrderDetails table of Northwind. There are two key values in the table (OrderID and ProductID); these are foreign keys, referencing the Orders and Products tables respectively. There is no column dedicated for use as a primary key; instead, the primary key is the combination of the columns OrderID and ProductID. 

The primary key constraint on these columns will generate a composite index, which is unique of course. The command the database would use to create the index looks something like the following:  


CREATE UNIQUE CLUSTERED INDEX PK_Order_Details ON [Order Details] (OrderID, ProductID)  


The order in which columns appear in a CREATE INDEX statement is significant. The primary sort order for this index is OrderID. When the OrderID is the same for two or more records, the database will sort this subset of records on ProductID. 

The order of columns determines how useful the index is for a query. Consider the phone book sorted by last name then first name. The phone book makes it easy to find all of the listings with a last name of Smith, or all of the listings with a last name of Jones and a first name of Lisa, but it is difficult to find all listings with a first name of Gary without scanning the book page by page. 

Likewise, the composite index on Order Details is useful in the following two queries: 


SELECT * FROM [Order Details] WHERE OrderID = 11077

SELECT * FROM [Order Details] WHERE OrderID = 11077 AND ProductID = 13


However, the following query cannot take advantage of the index we created since ProductID is the second part of the index key, just like the first name field in a phone book. 


 SELECT * FROM [Order Details] WHERE ProductID = 13  


In this case, ProductID is a primary key, however, so an index does exist on the ProductID column for the database to use for this query. 

Suppose the following query is the most popular query executed by our application, and we decided we needed to tune the database to support it. 


SELECT ProductName, UnitPrice FROM Products ORDER BY UnitPrice  


We could create the following index to cover the query. Notice we have specified two columns for the index: UnitPrice and ProductName (making the index a composite index): 


 CREATE INDEX IX_UnitPrice_ProductName ON Products(UnitPrice, ProductName)  


While covered queries can provide a performance benefit, remember there is a price to pay for each index we add to a table, and we can also never cover every query in a non-trivial application.

Additional Index Guidelines

Choosing the correct columns and types for an index is another important step in creating an effective index. In this section, we will talk about two main points, namely short index keys and selective indexes (we'll explain what selective indexes are in just a moment).

Keep Index Keys Short

The larger an index key is, the harder a database has to work to use the index. For instance, an integer key is smaller in size than a character field holding 100 characters. In particular, keep clustered indexes as short as possible.

There are several approaches to keeping an index key short. First, try to limit the index to as few columns as possible. While composite indexes are useful and can sometimes optimize a query, they are also larger and cause more disk reads for the database. Secondly, try to choose a compact data type for an index column, based on the number of bytes required for each data type. Integer keys are small and easy for the database to compare. In contrast, strings require a character-by-character comparison. 

As a rule of thumb, try to avoid using character columns in an index, particularly primary key indexes. Integer columns will always have an advantage over character fields in their ability to boost the performance of a query.

Distinct Index Keys

The most effective indexes are the indexes with a small percentage of duplicated values. Think of having a phone book for a city where 75% of the population has the last name of Smith. A phone book in this area might be easier to use if the entries were sorted by the resident's first names instead. A good index will allow the database to disregard as many records as possible during a search. 

An index with a high percentage of unique values is a selective index. Obviously, a unique index is the most selective index of all, because there are no duplicate values. SQL Server will track statistics for indexes and will know how selective each index is. The query optimizer utilizes these statistics when selecting the best index to use for a query.
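
(A side note, not from the article: you can inspect the statistics SQL Server keeps for an index with DBCC SHOW_STATISTICS, e.g. for the index created earlier.)

DBCC SHOW_STATISTICS ('Products', 'IDX_UnitPrice')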

Maintaining Indexes

In addition to creating an index, we'll need to view existing indexes, and sometimes delete or rename them. This is part of the ongoing maintenance cycle of a database as the schema changes, or even naming conventions change.

View Existing Indexes

A list of all indexes on a table is available in the dialog box we used to create an index. Click on the Selected index drop down control and scroll through the available indexes. 

There is also a stored procedure named sp_helpindex. This stored procedure gives all of the indexes for a table, along with all of the relevant attributes. The only input parameter to the procedure is the name of the table, as shown below. 


EXEC sp_helpindex Customers  

 Rename an Index 

We can also rename any user created object with the sp_rename stored procedure, including indexes. The sp_rename procedure takes, at a minimum, the current name of the object and the new name for the object. For indexes, the current name must include the name of the table, a dot separator, and the name of the index, as shown below:  


EXEC sp_rename 'Products.IX_UnitPrice', 'IX_Price'  


This will change the name of the IX_UnitPrice index to IX_Price.

Delete an Index

It is a good idea to remove an index from the database if the index is not providing any benefit. For instance, if we know the queries in an application are no longer searching for records on a particular column, we can remove the index. Unneeded indexes only take up storage space and diminish the performance of modifications. You can remove most indexes with the Delete button on the index dialog box, which we saw earlier. The equivalent SQL command is shown below. 


DROP INDEX Products.IX_Price


Again, we need to use the name of the table and the name of the index, with a dot separator. Some indexes are not so easy to drop, namely any index supporting a unique or primary key constraint. For example, the following command tries to drop the PK_Products index of the Products table.  


DROP INDEX Products.PK_Products  


Since the database uses PK_Products to enforce a primary key constraint on the Products table, the above command should produce the following error. 


An explicit DROP INDEX is not allowed on index 'Products.PK_Products'. It is being used for PRIMARY KEY constraint enforcement.  


Removing a primary key constraint from a table is a redesign of the table, and requires careful thought. It makes sense to know the only way to achieve this task is to either drop the table and use a CREATE TABLE command to recreate the table without the index, or to use the ALTER TABLE command.

Conclusion

In this article we learned how to create, manage, and select indexes for SQL Server tables. Most of what we covered is true for any relational database engine. Proper indexes are crucial for good performance in large databases. Sometimes you can make up for a poorly written query with a good index, but it can be hard to make up for poor indexing with even the best queries.    



by K. Scott Allen   


  (c) OdeToCode LLC 2004 - 2013     
        


Topic: SQLite limits

SQL Server Express Edition has a database size limit of 10GB, which doesn't suffice for all projects.

For SQLite, the limits are as follows:

- Maximum Number Of Rows In A Table
The theoretical maximum number of rows in a table is 2^64 (18446744073709551616 or about 1.8e+19). This limit is unreachable since the maximum database size of 140 terabytes will be reached first. A 140 terabytes database can hold no more than approximately 1e+13 rows, and then only if there are no indices and if each row contains very little data.

- Maximum Database Size
Every database consists of one or more "pages". Within a single database, every page is the same size, but different databases can have page sizes that are powers of two between 512 and 65536, inclusive. The maximum size of a database file is 2147483646 pages. At the maximum page size of 65536 bytes, this translates into a maximum database size of approximately 1.4e+14 bytes (140 terabytes, or 128 tebibytes, or 140,000 gigabytes or 128,000 gibibytes).
This particular upper bound is untested since the developers do not have access to hardware capable of reaching this limit. However, tests do verify that SQLite behaves correctly and sanely when a database reaches the maximum file size of the underlying filesystem (which is usually much less than the maximum theoretical database size) and when a database is unable to grow due to disk space exhaustion. 
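
The page size and current page count are visible per database through pragmas (a quick sketch; run against any open SQLite database):

PRAGMA page_size;       -- bytes per page for the open database
PRAGMA page_count;      -- pages currently in the file
PRAGMA max_page_count;  -- upper bound; page_size * max_page_count caps the file size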


Topic: SQLServer Auto BackUp

Auto Backup Database Using Maintenance Plans
(Requires SQL Server non-Express edition)

1. Start Up SQL Server Agent Service (Set it to Run Automatically)
2. Go To SQL Server Management Studio, Find TAB Management > Maintenance Plans. 
   Right Click > Maintenance Plans Wizard.
3. Create a maintenance plan. 
   Select Option button : Single Schedule for entire task.
4. Configure: Full/Incremental Database backup; 
   Define which database to backup
   Set backup location, etc.
5. Click Next, Choose your report mode, then Finish.


For SQL Server Express Edition, use Windows Scheduled Tasks:

In the batch file:
"C:\Program Files\Microsoft SQL Server\100\Tools\Binn\SQLCMD.EXE" -S (local)\SQLExpress -i D:\dbbackups\SQLExpressBackups.sql

In SQLExpressBackups.sql
BACKUP DATABASE MyDataBase1 TO  DISK = N'D:\DBbackups\MyDataBase1.bak' 
WITH NOFORMAT, INIT,  NAME = N'MyDataBase1 Backup', SKIP, NOREWIND, NOUNLOAD,  STATS = 10

BACKUP DATABASE MyDataBase2 TO  DISK = N'D:\DBbackups\MyDataBase2.bak' 
WITH NOFORMAT, INIT,  NAME = N'MyDataBase2 Backup', SKIP, NOREWIND, NOUNLOAD,  STATS = 10

GO
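
To run that batch file nightly via Windows Scheduled Tasks from the command line, something like the following should work (task name, path, and time are assumptions):

schtasks /create /tn "SQLExpressNightlyBackup" /tr "D:\dbbackups\backup.bat" /sc daily /st 02:00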


Topic: SQLServer Documentation Tool

=== Use sqldbdoc.exe command line tool to generate db document ===

C:\>sqldbdoc.exe "SERVER=.\SQLExpress;initial catalog=Rules;integrated security=True;multipleactiveresultsets=True;" c:\doc.htm

Altairis DB>doc version 1.0.0.0
Copyright (c) Altairis, 2011 | www.altairis.cz | SqlDbDoc.codeplex.com

Autodetecting output format...
Output format: html
dbo.sp_helpdiagrams
dbo.sp_helpdiagramdefinition
dbo.sp_creatediagram
dbo.sp_renamediagram
dbo.sp_alterdiagram
dbo.sp_dropdiagram
dbo.fn_diagramobjects
dbo.test
    id int
    name varchar
    dbo.PK_test
dbo.ClinicalProblemSet
    Id int
    Name nvarchar
    Description nvarchar
    ReferenceURL nvarchar
    Code nvarchar
    CodingSystem nvarchar
    dbo.PK_ClinicalProblemSet
dbo.EventSet
    Id int
    Name nvarchar
    Description nvarchar
    EventType nvarchar
    TimeStamp datetime
    Encounter_Id int
    dbo.PK_EventSet
    dbo.FK_EncounterEvent
dbo.InterventionSet
    Id int
    Name nvarchar
    ClinicalProblemIntervention_Intervention_Id int
    dbo.PK_InterventionSet
    dbo.FK_ClinicalProblemIntervention
dbo.PlanSet
    Id int
    Name nvarchar
    Duration nvarchar
    Objective nvarchar
    Cost nvarchar
    Criteria nvarchar
    InterventionPlan_Plan_Id int
    ClinicalProblemInstancePlan_Plan_Id int
    dbo.PK_PlanSet
    dbo.FK_InterventionPlan
    dbo.FK_ClinicalProblemInstancePlan
dbo.PhaseSet
    Id int
    Name nvarchar
    Duration nvarchar
    Period nvarchar
    PlanPhase_Phase_Id int
    dbo.PK_PhaseSet
    dbo.FK_PlanPhase
dbo.TaskSet
    Optional bit
    MultiSelect bit
    Id int
    Name nvarchar
    Code nvarchar
    CodingSystem nvarchar
    PhaseTask_Task_Id int
    dbo.PK_TaskSet
    dbo.FK_PhaseTask
dbo.MedicalOrderDefinition
    OrderType nvarchar
    AdministrationRoute nvarchar
    TemporalType nvarchar
    Frequency nvarchar
    Dosage nvarchar
    AdditionalInstruction nvarchar
    Id int
    Name nvarchar
    Code nvarchar
    CodingSystem nvarchar
    Description nvarchar
    TaskMedicalOrder_MedicalOrder_Id int
    dbo.PK_MedicalOrderSet
    dbo.FK_TaskMedicalOrder
dbo.ConceptSet
    Id int
    Code nvarchar
    CodingSystem nvarchar
    Description nvarchar
    Literal nvarchar
    Literal2 nvarchar
    Literal3 nvarchar
    Hierarchy nvarchar
    dbo.PK_ConceptSet
dbo.TriggerRuleSet
    Name nvarchar
    MajorVersion int
    MinorVersion int
    RuleSet nvarchar
    Status smallint
    AssemblyPath nvarchar
    ActivityName nvarchar
    ModifiedDate datetime
    ClinicalProblemTriggerRule_TriggerRule_Id int
    dbo.PK_TriggerRuleSet
    dbo.FK_ClinicalProblemTriggerRule
dbo.ClinicalProblemInstanceSet
    Id int
    State nvarchar
    Priority nvarchar
    ClinicalProblem_Id int
    Encounter_Id int
    TriggerRule_Name nvarchar
    TriggerRule_MajorVersion int
    TriggerRule_MinorVersion int
    dbo.PK_ClinicalProblemInstanceSet
    dbo.FK_ClinicalProblemClinicalProblemInstance
    dbo.FK_EncounterClinicalProblemInstance
    dbo.FK_ClinicalProblemInstanceTriggerRule
dbo.EvidenceSet
    Id int
    URL nvarchar
    EvidenceType nvarchar
    TimeStamp datetime
    Event_Id int
    dbo.PK_EvidenceSet
    dbo.FK_EventEvidence
dbo.LabTestSpecialtySet
    Id int
    Name nvarchar
    Comment nvarchar
    dbo.PK_LabTestSpecialtySet
dbo.LabTestSuiteSet
    Id int
    Name nvarchar
    Comment nvarchar
    LabTestSpecialtyLabTestSuite_LabTestSuite_Id int
    dbo.PK_LabTestSuiteSet
    dbo.FK_LabTestSpecialtyLabTestSuite
dbo.ContextItemSet
    Id int
    Name nvarchar
    Comment nvarchar
    Unit nvarchar
    DataType nvarchar
    Code nvarchar
    CodingSystem nvarchar
    ReferenceRange nvarchar
    dbo.PK_ContextItemSet
dbo.PatientSet
    Id int
    Name nvarchar
    BirthDay datetime
    Gender nvarchar
    PhotoURL nvarchar
    FK_EMR_Patient_Id nvarchar
    dbo.PK_PatientSet
dbo.EncounterSet
    Id int
    Admission datetime
    Discharge datetime
    Diagnosis nvarchar
    FK_EMR_Encounter_Id nvarchar
    Patient_Id int
    Profile_Id int
    dbo.PK_EncounterSet
    dbo.FK_PatientEncounter
    dbo.FK_ProfileEncounter
dbo.ChangeRecordSet
    Id int
    Operator nvarchar
    TimeStamp datetime
    OldState nvarchar
    NewState nvarchar
    Reason nvarchar
    ClinicalProblemInstanceChangeRecord_ChangeRecord_Id int
    dbo.PK_ChangeRecordSet
    dbo.FK_ClinicalProblemInstanceChangeRecord
dbo.FactSet
    Id int
    NumericValue float
    BooleanValue bit
    StringValue nvarchar
    IsAbnormal bit
    Confidentiality decimal
    LifeSpan nvarchar
    Evidence_Id int
    ContextItem_Id int
    ProfileFact_Fact_Id int
    ChangeRecordFact_Fact_Id int
    dbo.PK_FactSet
    dbo.FK_EvidenceFact
    dbo.FK_ContextItemFact
    dbo.FK_ProfileFact
    dbo.FK_ChangeRecordFact
dbo.ProfileSet
    Id int
    dbo.PK_ProfileSet
dbo.EBMSet
    Id int
    EvidenceLevel nvarchar
    RecommendationClass nvarchar
    Content nvarchar
    Source nvarchar
    URL nvarchar
    dbo.PK_EBMSet
dbo.MedicalOrderInstanceSet
    TimeStamp datetime
    FK_EMR_Order_Id nvarchar
    Id int
    MedicalOrder_Id int
    ClinicalProblemInstanceMedicalOrderInstance_MedicalOrderInstance_Id int
    dbo.PK_MedicalOrderInstanceSet
    dbo.FK_MedicalOrderInstanceMedicalOrder
    dbo.FK_ClinicalProblemInstanceMedicalOrderInstance
dbo.PlanInstanceSet
    State nvarchar
    CurrentPhase nvarchar
    Id int
    Plan_Id int
    ClinicalProblemInstancePlanInstance_PlanInstance_Id int
    dbo.PK_PlanInstanceSet
    dbo.FK_PlanInstancePlan
    dbo.FK_ClinicalProblemInstancePlanInstance
dbo.ContextItemSet_LabTestItem
    Id int
    LabTestSuiteLabTestItem_LabTestItem_Id int
    dbo.PK_ContextItemSet_LabTestItem
    dbo.FK_LabTestSuiteLabTestItem
    dbo.FK_LabTestItem_inherits_ContextItem
dbo.ContextItemSet_PhysiologicalItem
    Id int
    dbo.PK_ContextItemSet_PhysiologicalItem
    dbo.FK_PhysiologicalItem_inherits_ContextItem
dbo.ContextItemSet_Finding
    Id int
    dbo.PK_ContextItemSet_Finding
    dbo.FK_Finding_inherits_ContextItem
dbo.ContextItemSet_DemographicItem
    Id int
    dbo.PK_ContextItemSet_DemographicItem
    dbo.FK_DemographicItem_inherits_ContextItem
dbo.ContextItemSet_ProblemItem
    Id int
    dbo.PK_ContextItemSet_ProblemItem
    dbo.FK_ProblemItem_inherits_ContextItem
dbo.ClinicalProblemContextItem
    ClinicalProblemContextItem_ContextItem_Id int
    ContextItem_Id int
    dbo.PK_ClinicalProblemContextItem
    dbo.FK_ClinicalProblemContextItem_ClinicalProblem
    dbo.FK_ClinicalProblemContextItem_ContextItem
dbo.MedicalOrderEBM
    MedicalOrderEBM_EBM_Id int
    EBM_Id int
    dbo.PK_MedicalOrderEBM
    dbo.FK_MedicalOrderEBM_MedicalOrder
    dbo.FK_MedicalOrderEBM_EBM
dbo.TriggerRuleEBM
    TriggerRuleEBM_EBM_Name nvarchar
    TriggerRuleEBM_EBM_MajorVersion int
    TriggerRuleEBM_EBM_MinorVersion int
    EBM_Id int
    dbo.PK_TriggerRuleEBM
    dbo.FK_TriggerRuleEBM_TriggerRule
    dbo.FK_TriggerRuleEBM_EBM
dbo.PlanEBM
    PlanEBM_EBM_Id int
    EBM_Id int
    dbo.PK_PlanEBM
    dbo.FK_PlanEBM_Plan
    dbo.FK_PlanEBM_EBM
dbo.sp_upgraddiagrams
dbo.sysdiagrams
    name nvarchar
    principal_id int
    diagram_id int
    version int
    definition varbinary
    dbo.PK__sysdiagr__C2B05B6100551192
    dbo.UK_principal_name
Preparing XSL transformation...OK
Performing XSL transformation...OK

C:\>


Topic: SQLServer Fixed Roles

              Permissions of Fixed Database Roles (Database Engine) 

 SQL Server 2008 R2 


 Fixed database roles can be mapped to the more detailed permissions that are included in SQL Server. Fixed database roles are provided for convenience and backward compatibility. Assign more specific permissions whenever possible.  


 The following table describes the mapping of the fixed database roles to permissions. 



 Fixed database role  	 Database-level permission  	 Server-level permission 

 db_accessadmin  	 Granted: ALTER ANY USER, CREATE SCHEMA; Granted with GRANT option: CONNECT  	 Granted: VIEW ANY DATABASE 

 db_backupoperator  	 Granted: BACKUP DATABASE, BACKUP LOG, CHECKPOINT  	 Granted: VIEW ANY DATABASE 

 db_datareader  	 Granted: SELECT  	 Granted: VIEW ANY DATABASE 

 db_datawriter  	 Granted: DELETE, INSERT, UPDATE  	 Granted: VIEW ANY DATABASE 

 db_ddladmin  	 Granted: ALTER ANY ASSEMBLY, ALTER ANY ASYMMETRIC KEY, ALTER ANY CERTIFICATE, ALTER ANY CONTRACT, ALTER ANY DATABASE DDL TRIGGER, ALTER ANY DATABASE EVENT NOTIFICATION, ALTER ANY DATASPACE, ALTER ANY FULLTEXT CATALOG, ALTER ANY MESSAGE TYPE, ALTER ANY REMOTE SERVICE BINDING, ALTER ANY ROUTE, ALTER ANY SCHEMA, ALTER ANY SERVICE, ALTER ANY SYMMETRIC KEY, CHECKPOINT, CREATE AGGREGATE, CREATE DEFAULT, CREATE FUNCTION, CREATE PROCEDURE, CREATE QUEUE, CREATE RULE, CREATE SYNONYM, CREATE TABLE, CREATE TYPE, CREATE VIEW, CREATE XML SCHEMA COLLECTION, REFERENCES  	 Granted: VIEW ANY DATABASE 

 db_denydatareader  	 Denied: SELECT  	 Granted: VIEW ANY DATABASE 

 db_denydatawriter  	 Denied: DELETE, INSERT, UPDATE  	 Granted: VIEW ANY DATABASE 

 db_owner  	 Granted with GRANT option: CONTROL  	 Granted: VIEW ANY DATABASE 

 db_securityadmin  	 Granted: ALTER ANY APPLICATION ROLE, ALTER ANY ROLE, CREATE SCHEMA, VIEW DEFINITION  	 Granted: VIEW ANY DATABASE 

 dbm_monitor  	 Granted: VIEW most recent status in Database Mirroring Monitor  	 Granted: VIEW ANY DATABASE 

 Important: The dbm_monitor fixed database role is created in the msdb database when the first database is registered in Database Mirroring Monitor. The new dbm_monitor role has no members until a system administrator assigns users to the role.

 Fixed database roles are not equivalent to their database-level permission. For example, the db_owner fixed database role has the CONTROL DATABASE permission. But granting the CONTROL DATABASE permission does not make a user a member of the db_owner fixed database role. Members of the db_owner fixed database role are identified as the dbo user in the databases, but users with the CONTROL DATABASE permission are not.
        
See More >>


Topic: SQLServerImportAndExportWizard

http://msdn.microsoft.com/en-us/library/ms189660.aspx

The purpose of the SQL Server Import and Export Wizard is to copy data from a source to a destination. The wizard can also create a destination database and destination tables for you. However, if you have to copy multiple databases or tables, or other kinds of database objects, you should use the Copy Database Wizard instead. For more information, see Use the Copy Database Wizard.

Options
--------------------------------------------------------------------------------

Source
Identifies the selected source table, view, or query.

Destination
Identifies the selected destination table, view, or query.

Create destination table/file
Specify whether to create the destination table if it does not already exist.

Delete rows in destination table/file
Specify whether to clear the data from an existing table before loading new data.

Append rows to destination table/file
Specify whether to append the new data to the data already present in an existing table.

Edit SQL
Use the default statement in the Create Table SQL Statement dialog box, or modify it for your purposes. If you modify this statement, you must also make associated changes to table mapping.

Drop and re-create destination table
Choose this option to overwrite the destination table. This option is only available when you use the wizard to create the destination table. The destination table is only dropped and re-created if you save the package that the wizard creates, and then run the package again.

Enable identity insert
Choose this option to allow existing identity values in the source data to be inserted into an identity column in the destination table. By default, the destination identity column does not allow this.

Mappings
Displays how each column in the data source maps to a column in the destination.

This list has the following columns:

Source
View each source column for which you can set transformation parameters. 

Destination
Specify whether you want to ignore a column during the copy operation. You can copy only a subset of columns by selecting <ignore> in this column for columns that you want to skip. Before you map columns, you must ignore all columns that will not be mapped. 

Type
Select a data type for the column. 

Nullable
Specify whether a column will allow a null value. 

Size
Specify the number of characters in the column. 

Precision
Specify the precision of displayed data, referring to the number of digits. 

Scale
Specify the scale of displayed data, referring to the number of decimal places. 


Topic: Repository Design Pattern

Show >>


Topic: BriefSteps

Add New Item "ADO.Net Entity Data Model"

Add Entities and Associations in the designer

In "Model Browser" windows, right click "Generate Database From Model..."
The product of this step is a .sql file of DDL.

In "Model Browser" windows, right click "Add Code Generation Item..."

Choose "EF4.x DbContext Generator"
The product of this step is .tt files and auto-gen-ed .cs code files
* Note: The .tt files will be put in root dir of current project. You may need to manually move them to specific namespace folder, e.g. Models. 

Benefit:
In the future, if the model needs updating, just modify the .edmx model, re-generate the database schema, and re-generate the .tt files.

Entity Framework provides a MODEL-DRIVEN development (DDD) method.

NOTE: 
The reason associations are not used in part of the entity model is that the auto-generated DB tables do not get correct column names for many-to-many (*-*) associations.


Topic: Entity Framework Performance Up

With Entity Framework, the view + index approach can improve performance significantly.

For example:

-- The SCHEMABINDING option protects a view against modifications to the schema of the underlying tables, at least modifications that would invalidate that view
CREATE VIEW [dbo].[VW_Fact] WITH SCHEMABINDING
AS
SELECT   dbo.ContextItemDefinition.Name, dbo.ContextItemDefinition.Unit, dbo.ContextItemDefinition.DataType, 
                dbo.ContextItemDefinition.Code, dbo.ContextItemDefinition.CodingSystem, dbo.ContextItemDefinition.ReferenceRange, 
                dbo.ContextItemDefinition.UpperBound, dbo.ContextItemDefinition.SemanticType, 
                dbo.ContextItemDefinition.ClinicalSignificance, dbo.ContextItemDefinition.Abbreviation, 
                dbo.ContextItemDefinition.Source, dbo.ContextItemDefinition.NLP_Profile_Id, dbo.ContextItemDefinition.FK_EMR_Id, 
                dbo.ContextItemDefinition.Hierarchy, dbo.ContextItemDefinition.TimeStamp AS Expr1, dbo.ContextItemDefinition.Author, 
                dbo.ContextItemDefinition.Type, dbo.ContextItemDefinition.NavigationPath, dbo.ContextItemDefinition.Status, 
                dbo.ContextItemDefinition.Version, dbo.ContextItemDefinition.InputCode, dbo.ContextItemDefinition.DefaultValue, 
                dbo.ContextItemDefinition.LowerBound, dbo.Fact.Id, dbo.Fact.NumericValue, dbo.Fact.BooleanValue, 
                dbo.Fact.StringValue, dbo.Fact.Confidence, dbo.Fact.LifeSpan, dbo.Fact.Abnormity, dbo.Fact.FK_EMR_Encounter_Id, 
                dbo.Fact.TimeStamp, dbo.Fact.ContextItemDefinition_Id, dbo.ContextItemDefinition.PrimaryItem_Id, dbo.Fact.Event_Id, 
                dbo.Fact.ChangeRecord_Id
FROM      dbo.Fact INNER JOIN
                dbo.ContextItemDefinition ON dbo.Fact.ContextItemDefinition_Id = dbo.ContextItemDefinition.Id

GO

CREATE UNIQUE CLUSTERED INDEX IX_Fact
ON VW_Fact(Id)
-- WITH (FILLFACTOR = 100) -- A FILLFACTOR which is set too high can affect performance on insert/update operations. You will get page splits when you modify enough data. This is part of the intrinsic nature of the B-tree index structure -- it automatically balances the tree. If you want to adjust the FILLFACTOR to an explicit setting, BOL says that this only takes effect when you rebuild the index.


CREATE NONCLUSTERED INDEX [IX_Encounter] ON [dbo].[VW_Fact]
(
	[FK_EMR_Encounter_Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO

After applying this treatment to the Fact table, performance improved more than 5x (see view_index_before.png and view_index_after.png).
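
Querying the indexed view from EF is then just an ordinary LINQ query. A minimal sketch, assuming VW_Fact has been imported into the .edmx model as a read-only entity (key = Id); the context name ClinicalEntities is hypothetical:

using System.Linq;

public static class FactQueries
{
    public static void LoadEncounterFacts(string encounterId)
    {
        using (var db = new ClinicalEntities())   // hypothetical generated context
        {
            var facts = db.VW_Fact                // entity mapped onto the view
                          .Where(f => f.FK_EMR_Encounter_Id == encounterId)
                          .Select(f => new { f.Name, f.NumericValue, f.TimeStamp })
                          .ToList();
            // SQL Server can answer this from the view's indexes (Enterprise edition
            // matches them automatically; other editions may need WITH (NOEXPAND) in raw SQL).
        }
    }
}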


Topic: Entity Framework Profiler

Add a reference to the HibernatingRhinos.Profiler.Appender DLL in your application. The DLL can be found in the tool's folder.

At application start-up, call the following line of code: 
HibernatingRhinos.Profiler.Appender.EntityFramework.EntityFrameworkProfiler.Initialize();

Now start the profiler application, then run your application; the profiler will receive and display its input.


Reference: http://www.codeproject.com/Articles/101214/EFProf-Profiler-Tool-for-Entity-Framework


Topic: EntityFramework.Profiler-v2.0-Build-2066

Show >>


Topic: for multiple level db query, eager load can outperform lazy load
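
A hedged sketch of the point, with hypothetical context/entity names (ClinicalEntities, ClinicalProblems, ContextItems): eager loading fetches parent and children in one round-trip, while lazy loading issues one query per navigation-property access (the N+1 problem), which hurts most on multi-level queries.

using System.Linq;

public static class LoadingComparison
{
    public static void Compare()
    {
        using (var db = new ClinicalEntities())
        {
            // Eager: one JOIN query fetches problems and their context items together.
            var problems = db.ClinicalProblems
                             .Include("ContextItems")   // EF4 string-based Include
                             .ToList();

            // Lazy (default with dynamic proxies): each navigation access below
            // fires an extra query -- N+1 round-trips for N problems.
            foreach (var p in db.ClinicalProblems)
            {
                var n = p.ContextItems.Count;
            }
        }
    }
}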


Topic: ForeignKey

FKs are auto-generated by EF.
There seems to be no effortless way to rename an FK;
even editing the .edmx file directly cannot achieve the goal.
Just live with it.

LabTestSuiteLabTestSpecialty_LabTestSuite_Id
better be:
FK_LabTestSuite_LabTestSpecialty


LabTestSuiteLabTestItem_LabTestItem_Id
better be:
FK_LabTestItem_LabTestSuite


ClinicalProblemContextItem_ContextItem_Id
better be:
ClinicalProblem_Id


RuleIntervention_Intervention_Id
better be:
PK_Intervention_ClinicalProblem

InterventionPlan_Plan_Id
better be:
PK_Plan_Intervention

PlanPhase_Phase_Id
better be:
FK_Phase_Plan


Topic: MultipleEntityModels

Caution:

When using multiple entity models, only one model can use ObjectContext; the others should use DbContext.
Otherwise, an "entity definition not found" exception will be thrown.
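
A hedged illustration of this arrangement (all names hypothetical): keep the legacy ObjectContext-based generated context for one model only, and derive every additional model's context from DbContext.

using System.Data.Entity;    // DbContext (EF 4.1+)
using System.Data.Objects;   // ObjectContext (EF 4)

// Only one model stays on ObjectContext.
public class ModelAContext : ObjectContext
{
    public ModelAContext() : base("name=ModelAEntities", "ModelAEntities") { }
}

// All other models use DbContext.
public class ModelBContext : DbContext
{
    public ModelBContext() : base("name=ModelBEntities") { }
}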





Topic: OpenSqlFileBug

Phenomenon: 
When opening a .sql file inside the IDE, devenv becomes unresponsive.

Solution:
It's caused by the SQL Server 2008 R2 update being applied after the VS2010 installation.
Run DACProjectSystemSetup_enu.msi from the VS2010 SP1 ISO.


Topic: SerializationSupport

The entity classes generated by the original .tt file don't support DataContractSerializer.
Modify the .tt file as follows:


....

foreach (var entity in ItemCollection.GetItems<EntityType>().OrderBy(e => e.Name))
{
    fileManager.StartNewFile(entity.Name + ".cs");
    BeginNamespace(namespaceName, code);
#>
[DataContract]
<#=Accessibility.ForType(entity)#> <#=code.SpaceAfter(code.AbstractOption(entity))#>partial class <#=code.Escape(entity)#><#=code.StringBefore(" : ", code.Escape(entity.BaseType))#>
{
...

using System;
using System.Collections.Generic;
using System.Runtime.Serialization;

... 

void WriteProperty(string accessibility, string type, string name, string getterAccessibility, string setterAccessibility)
{
#>
    [DataMember]
    <#=accessibility#> <#=type#> <#=name#> { <#=getterAccessibility#>get; <#=setterAccessibility#>set; }
<#+
}
...
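
After regeneration, the entities should round-trip through DataContractSerializer. A minimal usage sketch; "Fact" stands in for any generated entity class:

using System.IO;
using System.Runtime.Serialization;

public static class SerializationDemo
{
    public static Fact RoundTrip(Fact fact)
    {
        var serializer = new DataContractSerializer(typeof(Fact));
        using (var stream = new MemoryStream())
        {
            // Works because the modified .tt emits [DataContract]/[DataMember].
            serializer.WriteObject(stream, fact);
            stream.Position = 0;
            return (Fact)serializer.ReadObject(stream);
        }
    }
}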


Topic: view_index_after


Topic: view_index_before


Topic: Deploy Filezilla Server in NAT

Use custom PASV settings if you are operating the server from behind a NAT router or a firewall. 
In that case, the IP address of the server is not accessible from outside of the router, so you should fill in the correct address here. 
Use the port range to limit the number of ports that will need to be forwarded through the router.

Use these "Passive Mode Settings":

Use custom port range: 50000 ~ 51000
Retrieve external IP address from: http://ip.filezilla-project.org/ip.php

Open these ports in the firewall settings.


Topic: Event

                 Event Object 

 A standard JavaScript object that FullCalendar uses to store information about a calendar event. Here are its properties:  

  id
 String/Integer. Optional.
 Uniquely identifies the given event. Different instances of repeating events should all have the same id.

  title
 String. Required.
 The text that will appear on an event's element.

  allDay
 true or false. Optional.
 Whether an event occurs at a specific time-of-day. This property affects whether an event's time is shown. Also, in the agenda views, determines if it is displayed in the all-day section.
 If this value is not explicitly specified, allDayDefault will be used if it is defined.
 If all else fails, FullCalendar will try to guess. If either the start or end value has a "T" as part of the ISO8601 date string, allDay will become false. Otherwise, it will be true.
 Don't include quotes around your true/false. This value is a boolean, not a string!

  start
 The date/time an event begins. Required.
 A Moment-ish input, like an ISO8601 string. Throughout the API this will become a real Moment object.

  end
 The exclusive date/time an event ends. Optional.
 A Moment-ish input, like an ISO8601 string. Throughout the API this will become a real Moment object.
 It is the moment immediately after the event has ended. For example, if the last full day of an event is Thursday, the exclusive end of the event will be 00:00:00 on Friday!

  url
 String. Optional.
 A URL that will be visited when this event is clicked by the user. For more information on controlling this behavior, see the eventClick callback.

  className
 String/Array. Optional.
 A CSS class (or array of classes) that will be attached to this event's element.

  editable
 true or false. Optional.
 Overrides the master editable option for this single event.

  startEditable
 true or false. Optional.
 Overrides the master eventStartEditable option for this single event.

  durationEditable
 true or false. Optional.
 Overrides the master eventDurationEditable option for this single event.

  rendering
 Allows alternate rendering of the event, like background events.
 Can be empty, "background", or "inverse-background".

  overlap
 true or false. Optional.
 Overrides the master eventOverlap option for this single event.
 If false, prevents this event from being dragged/resized over other events. Also prevents other events from being dragged/resized over this event.

  constraint
 An event ID, "businessHours", or an object. Optional.
 Overrides the master eventConstraint option for this single event.

  source
 Event Source Object. Automatically populated.
 A reference to the event source that this event came from.

  color
 Sets an event's background and border color just like the calendar-wide eventColor option.

  backgroundColor
 Sets an event's background color just like the calendar-wide eventBackgroundColor option.

  borderColor
 Sets an event's border color just like the calendar-wide eventBorderColor option.

  textColor
 Sets an event's text color just like the calendar-wide eventTextColor option.
        
See More >>


Topic: Documentation

*** Instructions on generating JavaScript documentation ***

# Enter the jsdoc-toolkit folder
# Copy the js files to the Scripts folder 
# Execute run.bat 
# The documentation is generated in the out folder


Topic: JavaScript Reserved Words

            

 You should avoid using these reserved words and keywords as function or variable names, as JavaScript has reserved them for its own use.  

  JavaScript Reserved Words  

	 break 	 export 	 return  

	 case 	 for 	 switch  

	 comment 	 function 	 this  

	 continue 	 if 	 typeof  

	 default 	 import 	 var  

	 delete 	 in 	 void  

	 do 	 label 	 while  

	 else 	 new 	 with    

  Java Keywords (Reserved by JavaScript)  

	 abstract 	 implements 	 protected  

	 boolean 	 instanceOf 	 public  

	 byte 	 int 	 short  

	 char 	 interface 	 static  

	 double 	 long 	 synchronized  

	 false 	 native 	 throws  

	 final 	 null 	 transient  

	 float 	 package 	 true  

	 goto 	 private 	

  ECMAScript Reserved Words  

	 catch 	 enum 	 throw  

	 class 	 extends 	 try  

	 const 	 finally 	

	 debugger 	 super 	

  Other JavaScript Keywords  

	 alert 	 isFinite 	 personalbar  

	 Anchor 	 isNan 	 Plugin  

	 Area 	 java 	 print  

	 arguments 	 JavaArray 	 prompt  

	 Array 	 JavaClass 	 prototype  

	 assign 	 JavaObject 	 Radio  

	 blur 	 JavaPackage 	 ref  

	 Boolean 	 length 	 RegExp  

	 Button 	 Link 	 releaseEvents  

	 callee 	 location 	 Reset  

	 caller 	 Location 	 resizeBy  

	 captureEvents 	 locationbar 	 resizeTo  

	 Checkbox 	 Math 	 routeEvent  

	 clearInterval 	 menubar 	 scroll  

	 clearTimeout 	 MimeType 	 scrollbars  

	 close 	 moveBy 	 scrollBy  

	 closed 	 moveTo 	 scrollTo  

	 confirm 	 name 	 Select  

	 constructor 	 NaN 	 self  

	 Date 	 navigate 	 setInterval  

	 defaultStatus 	 navigator 	 setTimeout  

	 document 	 Navigator 	 status  

	 Document 	 netscape 	 statusbar  

	 Element 	 Number 	 stop  

	 escape 	 Object 	 String  

	 eval 	 onBlur 	 Submit  

	 FileUpload 	 onError 	 sun  

	 find 	 onFocus 	 taint  

	 focus 	 onLoad 	 Text  

	 Form 	 onUnload 	 Textarea  

	 Frame 	 open 	 toolbar  

	 Frames 	 opener 	 top  

	 Function 	 Option 	 toString  

	 getClass 	 outerHeight 	 unescape  

	 Hidden 	 OuterWidth 	 untaint  

	 history 	 Packages 	 unwatch  

	 History 	 pageXoffset 	 valueOf  

	 home 	 pageYoffset 	 watch  

	 Image 	 parent 	 window  

	 Infinity 	 parseFloat 	 Window  

	 InnerHeight 	 parseInt 	

	 InnerWidth 	 Password 	    
        
See More >>


Topic: pdf.js

Home page
https://mozilla.github.io/pdf.js/

A demo
https://mozilla.github.io/pdf.js/web/viewer.html


Topic: JQuery Selector

                 jQuery Selectors  

  Selector  Example  Selects  

	 * 	 $("*") 	 All elements  

	 #id 	 $("#lastname") 	 The element with id="lastname"  

	 .class 	 $(".intro") 	 All elements with class="intro"  

	 .class,.class 	 $(".intro,.demo") 	 All elements with the class "intro" or "demo"  

	 element 	 $("p") 	 All <p> elements  

	 el1,el2,el3 	 $("h1,div,p") 	 All <h1>, <div> and <p> elements  

	 :first 	 $("p:first") 	 The first <p> element  

	 :last 	 $("p:last") 	 The last <p> element  

	 :even 	 $("tr:even") 	 All even <tr> elements  

	 :odd 	 $("tr:odd") 	 All odd <tr> elements  

	 :first-child 	 $("p:first-child") 	 All <p> elements that are the first child of their parent  

	 :first-of-type 	 $("p:first-of-type") 	 All <p> elements that are the first <p> element of their parent  

	 :last-child 	 $("p:last-child") 	 All <p> elements that are the last child of their parent  

	 :last-of-type 	 $("p:last-of-type") 	 All <p> elements that are the last <p> element of their parent  

	 :nth-child(n) 	 $("p:nth-child(2)") 	 All <p> elements that are the 2nd child of their parent  

	 :nth-last-child(n) 	 $("p:nth-last-child(2)") 	 All <p> elements that are the 2nd child of their parent, counting from the last child  

	 :nth-of-type(n) 	 $("p:nth-of-type(2)") 	 All <p> elements that are the 2nd <p> element of their parent  

	 :nth-last-of-type(n) 	 $("p:nth-last-of-type(2)") 	 All <p> elements that are the 2nd <p> element of their parent, counting from the last child  

	 :only-child 	 $("p:only-child") 	 All <p> elements that are the only child of their parent  

	 :only-of-type 	 $("p:only-of-type") 	 All <p> elements that are the only child, of their type, of their parent  

	 parent > child 	 $("div > p") 	 All <p> elements that are a direct child of a <div> element  

	 parent descendant 	 $("div p") 	 All <p> elements that are descendants of a <div> element  

	 element + next 	 $("div + p") 	 The <p> element that is next to each <div> element  

	 element ~ siblings 	 $("div ~ p") 	 All <p> elements that are siblings of a <div> element  

	 :eq(index) 	 $("ul li:eq(3)") 	 The fourth element in a list (index starts at 0)  

	 :gt(no) 	 $("ul li:gt(3)") 	 List elements with an index greater than 3  

	 :lt(no) 	 $("ul li:lt(3)") 	 List elements with an index less than 3  

	 :not(selector) 	 $("input:not(:empty)") 	 All input elements that are not empty  

	 :header 	 $(":header") 	 All header elements <h1>, <h2> ...  

	 :animated 	 $(":animated") 	 All animated elements  

	 :focus 	 $(":focus") 	 The element that currently has focus  

	 :contains(text) 	 $(":contains('Hello')") 	 All elements which contain the text "Hello"  

	 :has(selector) 	 $("div:has(p)") 	 All <div> elements that have a <p> element  

	 :empty 	 $(":empty") 	 All elements that are empty  

	 :parent 	 $(":parent") 	 All elements that are a parent of another element  

	 :hidden 	 $("p:hidden") 	 All hidden <p> elements  

	 :visible 	 $("table:visible") 	 All visible tables  

	 :root 	 $(":root") 	 The document's root element  

	 :lang(language) 	 $("p:lang(de)") 	 All <p> elements with a lang attribute value starting with "de"  

	 [attribute] 	 $("[href]") 	 All elements with a href attribute  

	 [attribute=value] 	 $("[href='default.htm']") 	 All elements with a href attribute value equal to "default.htm"  

	 [attribute!=value] 	 $("[href!='default.htm']") 	 All elements with a href attribute value not equal to "default.htm"  

	 [attribute$=value] 	 $("[href$='.jpg']") 	 All elements with a href attribute value ending with ".jpg"  

	 [attribute|=value] 	 $("[hreflang|='en']") 	 All elements with a hreflang attribute value starting with "en"  

	 [attribute^=value] 	 $("[name^='hello']") 	 All elements with a name attribute value starting with "hello"  

	 [attribute~=value] 	 $("[name~='hello']") 	 All elements with a name attribute value containing the word "hello"  

	 [attribute*=value] 	 $("[name*='hello']") 	 All elements with a name attribute value containing the string "hello"  

	 :input 	 $(":input") 	 All input elements  

	 :text 	 $(":text") 	 All input elements with type="text"  

	 :password 	 $(":password") 	 All input elements with type="password"  

	 :radio 	 $(":radio") 	 All input elements with type="radio"  

	 :checkbox 	 $(":checkbox") 	 All input elements with type="checkbox"  

	 :submit 	 $(":submit") 	 All input elements with type="submit"  

	 :reset 	 $(":reset") 	 All input elements with type="reset"  

	 :button 	 $(":button") 	 All input elements with type="button"  

	 :image 	 $(":image") 	 All input elements with type="image"  

	 :file 	 $(":file") 	 All input elements with type="file"  

	 :enabled 	 $(":enabled") 	 All enabled input elements  

	 :disabled 	 $(":disabled") 	 All disabled input elements  

	 :selected 	 $(":selected") 	 All selected input elements  

	 :checked 	 $(":checked") 	 All checked input elements    
        
See More >>


Topic: install jupyterhub

https://pypi.python.org/pypi/jupyterhub/0.2.0


sudo apt-get install python3.5
sudo apt-get install python3-pip
sudo apt-get install python3-dev

sudo apt-get install npm nodejs-legacy
sudo apt-get install python3-pip
sudo -H pip3 install --upgrade pip

sudo npm install -g configurable-http-proxy
sudo -H pip3 install jupyterhub
sudo -H pip3 install --upgrade notebook
sudo -H pip3 install scipy # numpy matplotlib ...


#### MAY NOT BE NECESSARY #####

git clone https://github.com/jupyter/jupyterhub.git
cd jupyterhub
sudo pip3 install -r requirements.txt
sudo -H pip3 install .

sudo -H pip3 install -r dev-requirements.txt -e .
sudo -H python3 setup.py js # fetch updated client-side js (changes rarely)
sudo -H python3 setup.py css # recompile CSS from LESS sources

#### MAY NOT BE NECESSARY #####


# start up
jupyterhub 


# generate jupyterhub_config.py
jupyterhub --generate-config
# revise
c.Spawner.notebook_dir = '~/Documents/vault_285714/notebooks'



# install Octave kernel
sudo apt-get install gnuplot
sudo apt-get install octave
sudo -H pip3 install octave_kernel
sudo python3 -m octave_kernel.install  # don't use python, use python3



# install C kernel
pip install jupyter-c-kernel
install_c_kernel
jupyter-notebook   # Enjoy!


Topic: LaTeX_Math

Show >>


Topic: NLog Layout

             Layout Renderers

${asp-application} - ASP Application variable. 
${aspnet-application} - ASP.NET Application variable. 
${aspnet-request} - ASP.NET Request variable. 
${aspnet-session} - ASP.NET Session variable. 
${aspnet-sessionid} - ASP.NET Session ID. 
${aspnet-user-authtype} - ASP.NET User variable. 
${aspnet-user-identity} - ASP.NET User variable. 
${asp-request} - ASP Request variable. 
${asp-session} - ASP Session variable. 
${assembly-version} - The version of the executable in the default application domain. 
${basedir} - The current application domain's base directory. 
${callsite} - The call site (class name, method name and source information). 
${counter} - A counter value (increases on each layout rendering). 
${date} - Current date and time. 
${document-uri} - URI of the HTML page which hosts the current Silverlight application. 
${environment} - The environment variable. 
${event-context} - Log event context data. 
${exception} - Exception information provided through a call to one of the Logger.*Exception() methods. 
${file-contents} - Renders contents of the specified file. 
${gc} - The information about the garbage collector. 
${gdc} - Global Diagnostic Context item. Provided for compatibility with log4net. 
${guid} - Globally-unique identifier (GUID). 
${identity} - Thread identity information (name and authentication information). 
${install-context} - Installation parameter (passed to InstallNLogConfig). 
${level} - The log level. 
${literal} - A string literal. 
${log4jxmlevent} - XML event description compatible with log4j, Chainsaw and NLogViewer. 
${logger} - The logger name. 
${longdate} - The date and time in a long, sortable format yyyy-MM-dd HH:mm:ss.mmm. 
${machinename} - The machine name that the process is running on. 
${mdc} - Mapped Diagnostic Context item. Provided for compatibility with log4net. 
${message} - The formatted log message. 
${ndc} - Nested Diagnostic Context item. Provided for compatibility with log4net. 
${newline} - A newline literal. 
${nlogdir} - The directory where NLog.dll is located. 
${performancecounter} - The performance counter. 
${processid} - The identifier of the current process. 
${processinfo} - The information about the running process. 
${processname} - The name of the current process. 
${processtime} - The process time in format HH:mm:ss.mmm. 
${qpc} - High precision timer, based on the value returned from QueryPerformanceCounter(), optionally converted to seconds. 
${registry} - A value from the Registry. 
${shortdate} - The short date in a sortable format yyyy-MM-dd. 
${sl-appinfo} - Information about the Silverlight application. 
${specialfolder} - System special folder path (includes My Documents, My Music, Program Files, Desktop, and more). 
${stacktrace} - Stack trace renderer. 
${tempdir} - A temporary directory. 
${threadid} - The identifier of the current thread. 
${threadname} - The name of the current thread. 
${ticks} - The Ticks value of current date and time. 
${time} - The time in a 24-hour, sortable format HH:mm:ss.mmm. 
${windows-identity} - Thread Windows identity information (username). 

             Wrapper Layout Renderers

${cached} - Applies caching to another layout output. 
${filesystem-normalize} - Filters characters not allowed in file names by replacing them with a safe character. 
${json-encode} - Escapes output of another layout using JSON rules. 
${lowercase} - Converts the result of another layout output to lower case. 
${onexception} - Only outputs the inner layout when an exception has been defined for the log message. 
${pad} - Applies padding to another layout output. 
${replace} - Replaces a string in the output of another layout with another string. 
${rot13} - Decodes text "encrypted" with ROT-13. 
${trim-whitespace} - Trims the whitespace from the result of another layout renderer. 
${uppercase} - Converts the result of another layout output to upper case. 
${url-encode} - Encodes the result of another layout output for use with URLs. 
${when} - Only outputs the inner layout when the specified condition has been met. 
${whenEmpty} - Outputs an alternative layout when the inner layout produces an empty result. 
${xml-encode} - Converts the result of another layout output to be XML-compliant. 

             Custom Layout Renderers

${gelf} - Converts log to GELF format. 

             Passing Custom Values to a Layout

 Even though the layout renderers provide many pre-defined values, you may need to pass application-specific values to your Layouts. You can pass your own values in code by adding custom Properties to the event. You then retrieve the value using the ${event-context} renderer. See the documentation for ${event-context} for an example. 


  Event Context Layout Renderer  

 Log event context data. 

 Supported in .NET, Silverlight, Compact Framework and Mono. 

 Configuration Syntax 

${event-context:item=String}

 Parameters 

 item - Name of the item. Required. 

 Example 

 In a C# class, create an event and add an element to the Properties dictionary (or the deprecated Context dictionary): 

Logger log = LogManager.GetCurrentClassLogger();
LogEventInfo theEvent = new LogEventInfo(LogLevel.Debug, "", "Pass my custom value");
theEvent.Properties["MyValue"] = "My custom string";
// deprecated
theEvent.Context["TheAnswer"] = 42;
log.Log(theEvent);

 and in your NLog.config file: 

${event-context:item=MyValue}   -- renders "My custom string"
${event-context:item=TheAnswer} -- renders "42"
        
See More >>


Topic: NLog

            

  Tutorial  



   Installing NLog 

 NLog can be downloaded from the Download section or from NuGet. Both source and binary packages are available. For beginners, it is a good idea to use the recommended installer package in msi or exe format. More advanced scenarios may require the use of zip files, which are also available. 

 At this time NLog 2.0 is stable and can be tried for production applications. 

 Once you have downloaded the installation package, run it and choose the default installation options. This will deploy NLog assemblies, API documentation and Visual Studio integration, which includes:  

 Item Templates - to quickly add an NLog Configuration File to your project 
 Code Snippets - for C# and Visual Basic, to simplify creation of Logger instances 
 NLog.xsd file - Intellisense(tm) support for editing NLog configuration files 
 Registration of the NLog binary directory in the Add Reference... dialog 

   Adding NLog to an application 

 Let's start by creating an empty console project in Visual Studio. 

 In order to use NLog in the application, we must add a reference to NLog.dll and a Configuration File which will specify Log Routing rules. Those two things can be done in a single step, just by adding an NLog.config file to the project. To do so, right-click on your project in Visual Studio and select Add New Item. 

 From the list on the left, select the NLog category, then select Empty NLog Configuration File, then click Add. 


 This will add a reference to NLog.dll from C:\Program Files\NLog and a Configuration File without any rules. Since NLog requires the configuration file to be copied to the application directory, we must make just one change. In Solution Explorer select NLog.config and set Copy to output directory to Copy always. 


 That's it, you can now compile and run your application and it will be able to use NLog. 

   Logging API 

 In order to emit log messages from the application you need to use the logging API. There are two classes that you will be using the most: Logger and LogManager, both in the NLog namespace. Logger represents the named source of logs and has methods to emit log messages, and LogManager creates and manages instances of loggers. 

 It is important to understand that Logger does not represent any particular log output (and thus is never tied to a particular log file, etc.) but is only a source, which typically corresponds to a class in your code. Mapping from log sources to outputs is defined separately through the Configuration File or Configuration API. Maintaining this separation lets you keep logging statements in your code and easily change how and where the logs are written, just by updating the configuration in one place. 

   Creating loggers 

 Most applications will use one logger per class, where the name of the logger is the same as the name of the class. As mentioned before, you must use LogManager to create Logger instances. To create a logger with a given name, call: 


using NLog;

Logger logger = LogManager.GetLogger("MyClassName");


 In most cases you will have one logger per class, so it makes sense to give the logger the same name as the current class. LogManager exposes a method which creates a logger for the current class, called GetCurrentClassLogger(). Because loggers are thread-safe, you can simply create the logger once and store it in a static variable: 


namespace MyNamespace
{
    public class MyClass
    {
        private static Logger logger = LogManager.GetCurrentClassLogger();
    }
}


 TIP: Instead of writing the logger declaration by hand, you can use a Visual Studio code snippet: just type nlogger and press the TAB key twice, which will insert the snippet. 

   Log levels 

 Each log message has an associated log level, which identifies how important/detailed the message is. NLog can route log messages based primarily on their logger name and log level. 

 NLog supports the following log levels:  

Trace - very detailed logs, which may include high-volume information such as protocol payloads. This log level is typically only enabled during development 
Debug - debugging information, less detailed than trace, typically not enabled in production environment. 
Info - information messages, which are normally enabled in production environment 
Warn - warning messages, typically for non-critical issues, which can be recovered or which are temporary failures 
Error - error messages 
Fatal - very serious errors 

   Emitting log messages 

 In order to emit a log message you can simply call one of the methods on the Logger. The Logger class has six methods whose names correspond to log levels: Trace(), Debug(), Info(), Warn(), Error() and Fatal(). There is also a Log() method which takes the log level as a parameter. 


using NLog;

public class MyClass
{
    private static Logger logger = LogManager.GetCurrentClassLogger();

    public void MyMethod1()
    {
        logger.Trace("Sample trace message");
        logger.Debug("Sample debug message");
        logger.Info("Sample informational message");
        logger.Warn("Sample warning message");
        logger.Error("Sample error message");
        logger.Fatal("Sample fatal error message");
        // alternatively you can call the Log() method
        // and pass log level as the parameter.
        logger.Log(LogLevel.Info, "Sample fatal error message");
    }
}


 Log messages can also be parameterized - you can use the same format strings as when using Console.WriteLine() and String.Format(): 


using NLog;

public class MyClass
{
    private static Logger logger = LogManager.GetCurrentClassLogger();

    public void MyMethod1()
    {
        int k = 42;
        int l = 100;

        logger.Trace("Sample trace message, k={0}, l={1}", k, l);
        logger.Debug("Sample debug message, k={0}, l={1}", k, l);
        logger.Info("Sample informational message, k={0}, l={1}", k, l);
        logger.Warn("Sample warning message, k={0}, l={1}", k, l);
        logger.Error("Sample error message, k={0}, l={1}", k, l);
        logger.Fatal("Sample fatal error message, k={0}, l={1}", k, l);
        logger.Log(LogLevel.Info, "Sample fatal error message, k={0}, l={1}", k, l);
    }
}


 TIP: You should avoid doing string formatting (such as concatenation, or calling String.Format) yourself and instead use the built-in formatting functionality of NLog. The main reason for this is performance: 

 Formatting log messages takes a lot of time, so NLog tries to defer formatting until it knows that the log message will actually be written to some output. If the message ends up being skipped because of the logging configuration, you will not pay the price of String.Format() at all. See also Optimizing Logging Performance. 

   Configuration File 

 So far we have learned how to emit log messages from code, but we have not configured any outputs for our logs. So, when you run your instrumented application at this point, you will see - well - nothing. Time to open the NLog.config file and add some logging rules:  

 In the <targets> section, add: 


<target name="logfile" xsi:type="File" fileName="file.txt" />


 This will define a target which will send logs to a file named file.txt. 

 In the <rules> section, add: 


<logger name="*" minlevel="Info" writeTo="logfile" />


 This snippet will direct all logs (name="*") of level Info or higher (which includes Info, Warn, Error and Fatal) to a target named logfile.  

 Note that as you are typing this in Visual Studio, you should see Intellisense suggesting attribute names and validating their values. The final configuration should look like this: 


<?xml version="1.0" encoding="utf-8" ?>
<nlog xmlns="http://www.nlog-project.org/schemas/NLog.xsd"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <targets>
    <target name="logfile" xsi:type="File" fileName="file.txt" />
  </targets>
  <rules>
    <logger name="*" minlevel="Info" writeTo="logfile" />
  </rules>
</nlog>


 Now, when you run the application, you should see log messages written to file.txt in the current directory. 

   Multiple targets 

 Let's try something more complex now. Imagine you want to send very detailed logs to a file, and you also want to see the logs in the console window, but slightly less detailed. Here's the configuration file which implements this: 


<?xml version="1.0" encoding="utf-8" ?>
<nlog xmlns="http://www.nlog-project.org/schemas/NLog.xsd"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <targets>
    <target name="logfile" xsi:type="File" fileName="file.txt" />
    <target name="console" xsi:type="Console" />
  </targets>
  <rules>
    <logger name="*" minlevel="Trace" writeTo="logfile" />
    <logger name="*" minlevel="Info" writeTo="console" />
  </rules>
</nlog>


 As you can see, we now have multiple targets and multiple rules which route logs to them. 

   Logger-specific routing 

 Another very common scenario requires producing more detailed logs from some components which are currently being developed, while reducing output from some other components. We can use the following configuration: 


<?xml version="1.0" encoding="utf-8" ?>
<nlog xmlns="http://www.nlog-project.org/schemas/NLog.xsd"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <targets>
    <target name="logfile" xsi:type="File" fileName="file.txt" />
  </targets>
  <rules>
    <logger name="SomeNamespace.Component.*" minlevel="Trace" writeTo="logfile" final="true" />
    <logger name="*" minlevel="Info" writeTo="logfile" />
  </rules>
</nlog>


 The first rule will send logs from loggers whose names begin with SomeNamespace.Component. and whose level is Trace or higher to the log file. The attribute final="true" will cause further processing to be stopped after performing the write.

 The second rule will send all remaining logs to the same log file, with the restriction that the level must be Info or higher. 

   Wrappers 

 NLog supports a special kind of targets which don't do any logging by themselves, but which modify the behavior of other targets. Those targets are called wrappers. The most commonly used ones are:  

 ImpersonatingWrapper - Impersonates another user for the duration of the write. 
 AsyncWrapper - Provides asynchronous, buffered execution of target writes. 
 FallbackGroup - Provides fallback-on-error. 
 FilteringWrapper - Filters log entries based on a condition.  

 Many more wrappers are available. You can find the full list here. 

 In order to use wrappers, simply enclose the <target /> element with another one representing a wrapper and use the name of the wrapper in your <rules /> section, as in the following example: 


<?xml version="1.0" encoding="utf-8" ?>
<nlog xmlns="http://www.nlog-project.org/schemas/NLog.xsd"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <targets>
    <target name="asyncFile" xsi:type="AsyncWrapper">
      <target name="logfile" xsi:type="File" fileName="file.txt" />
    </target>
  </targets>
  <rules>
    <logger name="*" minlevel="Info" writeTo="asyncFile" />
  </rules>
</nlog>


 This will make all writes to the file asynchronous, which will improve responsiveness of the calling thread. 

   Layouts 

 Layouts provide a way to format the contents of the logs as they are written to a file. There are 2 main kinds of layouts:  

 simple layouts - which are composed of Layout Renderers 
 structural layouts - which can output XML, CSV, and other complex formats  

 A simple layout is just a string, with special tags embedded between ${ and }. For example, the following declaration will cause each log message to be prefixed with the date formatted as yyyyMMddHHmmss: 


<target name="logfile" xsi:type="File" fileName="file.txt"
        layout="${date:format=yyyyMMddHHmmss} ${message}" />


        
See More >>


Topic: Entrophy_Keyword

                

 At present, the most common method for extracting keywords from an article is TF-IDF, but the keywords it extracts are not necessarily accurate. 

 Take the simplest example: a news article begins with "记者李元芳报道" ("reporter Li Yuanfang reports"), which segments into "记者 (reporter), 李元芳 (Li Yuanfang), 报道 (reports)". Of these three words, "reporter" and "reports" occur all the time, so their IDF values will generally be low, while "李元芳", an obscure newcomer that even Google has barely seen, will get an extremely high IDF value. Even though "李元芳" appears only once in the article, that alone is enough to crown it the top keyword. 

 Obviously, treating "李元芳" as a keyword of this article is wrong; at the very least it should not rank in the top five. One idea is to use term frequency to eliminate it, by ignoring every word that occurs only once. This works in some cases, but when the article is short and almost every word occurs exactly once, no keywords can be extracted at all. Another idea is to discard words with a very high IDF value, but then "how high is too high" becomes its own problem. 

 Looking closely, IDF itself is the culprit here. Computing IDF requires the document frequency of every word, which in turn requires a corpus to estimate from. The more corpus data the better, but in practice we never have enough, so we can only generalize from limited samples; that is reasonable in principle, but its accuracy is hard to guarantee. 

 To solve this problem thoroughly, keyword extraction should not rely on corpus frequencies at all. So the concept of information entropy was brought in; see: 

   http://zh.wikipedia.org/wiki/%E7%86%B5_(%E4%BF%A1%E6%81%AF%E8%AE%BA) 

 All of the above is background; here is how to use information entropy to extract keywords from an article: 

 First, note that what makes a word a keyword is that the words that can appear next to it are rich and varied. So we can define the (one-sided) information entropy of a word W as 

   H(W) = -sum_i p_i * log(p_i) 

 where W is the word in question and p_i is the relative frequency of the i-th distinct word appearing adjacent to W (computed separately for the left side and the right side). 

 For example, suppose an article contains the context AWC twice and BWD once. Then the left entropy of W is 

   H_left(W) = -( (2/3) * log(2/3) + (1/3) * log(1/3) ) 

 where 2/3 means word A appeared 2 times out of 3 to the left of W, and B appeared only once, hence 1/3. 

 The right entropy of W is computed the same way. If the contexts are AWC and BWC, then the right entropy of W is 0, because it is -1 * log(1). 

 Compute the left and right entropy for every word; if both the left and right entropy of a word are large, the word is very likely a keyword. 

 Returning to the counterexample at the beginning: "李元芳" appears only once, at the start of the article, so its entropy is 0 and it will certainly not be a keyword. 

 Finally, consider a special case: some word B has large left entropy but small right entropy, while the word C to its right has small left entropy and large right entropy. Pictorially this is XBCY: B and C almost always appear together, but X and Y vary a lot. In that case B and C can be combined into a single keyword; this is why we often see "智能手机" ("smartphone") extracted as one keyword. This also touches on another interesting research direction in NLP - new word discovery. 

 Original article: http://blog.csdn.net/zhaoxinfan/article/details/12751405   
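
A minimal sketch of the one-sided entropy computation described above (my own illustration, not from the original post); the neighbor counts would come from scanning the segmented article:

using System;
using System.Collections.Generic;
using System.Linq;

static class EntropyKeyword
{
    // H(W) = -sum p_i * log(p_i) over the distinct words adjacent to W on one side.
    static double SideEntropy(IEnumerable<int> neighborCounts)
    {
        double total = neighborCounts.Sum();
        return -neighborCounts.Select(c => c / total).Sum(p => p * Math.Log(p));
    }

    static void Main()
    {
        // Example from the text: contexts AWC, AWC, BWD.
        // Left neighbors of W: A twice, B once -> -(2/3 ln 2/3 + 1/3 ln 1/3) ≈ 0.64
        Console.WriteLine(SideEntropy(new[] { 2, 1 }));
        // With contexts AWC, BWC the right neighbors are all C:
        // -1 * ln(1) = 0, so W would not qualify as a keyword.
        Console.WriteLine(SideEntropy(new[] { 3 }));
    }
}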
        
See More >>


Topic: ICTPOS3.0

                 ICT Chinese Part-of-Speech Tag Set (计算所汉语词性标记集) 

 Version 3.0 

   Contents:

   0.  Introduction (说明)
   1.  Nouns 名词 (1 first-level, 7 second-level, 5 third-level)
   2.  Time words 时间词 (1 first-level, 1 second-level)
   3.  Locative words 处所词 (1 first-level)
   4.  Localizer words 方位词 (1 first-level)
   5.  Verbs 动词 (1 first-level, 9 second-level)
   6.  Adjectives 形容词 (1 first-level, 4 second-level)
   7.  Distinguishing words 区别词 (1 first-level, 2 second-level)
   8.  Status words 状态词 (1 first-level)
   9.  Pronouns 代词 (1 first-level, 4 second-level, 6 third-level)
   10. Numerals 数词 (1 first-level, 1 second-level)
   11. Classifiers 量词 (1 first-level, 2 second-level)
   12. Adverbs 副词 (1 first-level)
   13. Prepositions 介词 (1 first-level, 2 second-level)
   14. Conjunctions 连词 (1 first-level, 1 second-level)
   15. Particles 助词 (1 first-level, 15 second-level)
   16. Interjections 叹词 (1 first-level)
   17. Modal particles 语气词 (1 first-level)
   18. Onomatopoeia 拟声词 (1 first-level)
   19. Prefixes 前缀 (1 first-level)
   20. Suffixes 后缀 (1 first-level)
   21. Strings 字符串 (1 first-level, 2 second-level)
   22. Punctuation 标点符号 (1 first-level, 16 second-level)

      0.  Introduction (说明) 

 The ICT Chinese POS tag set (99 tags in total: 22 first-level, 66 second-level, 11 third-level) is mainly used by the Chinese lexical analyzer, the parser and the Chinese-English machine translation system developed at the Institute of Computing Technology (ICT), Chinese Academy of Sciences. It mainly references the following tag sets: 

  1.  the tag set of the Peking University People's Daily corpus; 

  2.  the Peking University 2002 tag set (draft); 

  3.  the Tsinghua University Chinese treebank tag set; 

  4.  the tag set of the Institute of Applied Linguistics, Ministry of Education (draft national recommended standard, 2002 edition); 

  5.  the tag set of the University of Pennsylvania Chinese Penn Treebank (ChinesePennTreeBank). 

  Since ICT's Chinese lexical analyzer is trained mainly on the PKU People's Daily corpus, this tag set is modeled primarily on that corpus's tag set, and also draws on the grammatical information for Chinese words given in PKU's Grammatical Knowledge-base of Contemporary Chinese (《汉语语法信息词典》). 

 The following factors were the main considerations in designing the tag set: 

  1.  helping to improve the segmentation and tagging accuracy of the Chinese lexical analyzer; 

  2.  helping to improve the accuracy of the Chinese parser; 

  3.  helping the Chinese-English machine translation system translate; 

  4.  being easy to convert to from the PKU People's Daily tag set; 

  5.  for words with different grammatical functions, subdividing subclasses as finely as possible, provided this does not make disambiguation in lexical and syntactic analysis difficult. 

  Based on these considerations, error-prone tags were avoided during annotation in favor of tags that are robust and clearly help the accuracy of Chinese lexical and syntactic analysis. For example, among the verb subclasses, following the Penn Chinese Treebank, the verbs 是 (be) and 有 (have) were given their own tags rather than a "copula" tag: the verb 是 has many syntactic functions, of which copula is only one, and distinguishing these functions reliably is very hard and would lower lexical analysis accuracy. 

  Among the noun subclasses, "Chinese person names", "Japanese person names" and "transliterated person names" are distinguished, not only because the three require different parameters for training and recognition, but also because Chinese-English machine translation must handle them with different analysis algorithms. Likewise, "numeral + 年" expressing a date (e.g. 1995年) is merged into a single time word, while "numeral + 年" expressing a number of years is tagged separately as a numeral and a classifier; experiments show that this distinction can be made with high accuracy by statistical methods at the lexical analysis stage, and that it is very important for subsequent parsing and machine translation. 

  Some word classes (particles and punctuation) are essentially closed sets whose members differ greatly in grammatical function; in such cases the subclasses are divided as finely as possible. 

  In addition, as in other tag sets, the subclasses in this scheme are only special cases of the major classes that are worth distinguishing; the subclass division is not meant to be exhaustive. 

   1.  Nouns 名词 (1 first-level, 7 second-level, 5 third-level) 

 Nouns are divided into the following subclasses: 

 n    noun 
 nr   person name 
 nr1  Chinese surname 
 nr2  Chinese given name 
 nrj  Japanese person name 
 nrf  transliterated person name 
 ns   place name 
 nsf  transliterated place name 
 nt   organization/group name 
 nz   other proper noun 
 nl   noun idiom (名词性惯用语) 
 ng   nominal morpheme (名词性语素) 

   2.  Time words 时间词 (1 first-level, 1 second-level) 

 t    time word 
 tg   time-word morpheme 

   3.  Locative words 处所词 (1 first-level) 

 s    locative word 

   4.  Localizer words 方位词 (1 first-level) 

 f    localizer word 

   5.  Verbs 动词 (1 first-level, 9 second-level) 

 v    verb 
 vd   adverbial verb (副动词) 
 vn   nominal verb (名动词) 
 vshi the verb 是 (to be) 
 vyou the verb 有 (to have) 
 vf   directional verb (趋向动词) 
 vx   formal/dummy verb (形式动词) 
 vi   intransitive verb 
 vl   verb idiom (动词性惯用语) 
 vg   verbal morpheme (动词性语素) 

   6.  Adjectives 形容词 (1 first-level, 4 second-level) 

 a    adjective 
 ad   adverbial adjective (副形词) 
 an   nominal adjective (名形词) 
 ag   adjectival morpheme (形容词性语素) 
 al   adjective idiom (形容词性惯用语) 

   7.  Distinguishing words 区别词 (1 first-level, 2 second-level) 

 b    distinguishing word 
 bl   distinguishing-word idiom (区别词性惯用语) 

   8.  Status words 状态词 (1 first-level) 

 z    status word 

   9.  Pronouns 代词 (1 first-level, 4 second-level, 6 third-level) 

 r    pronoun 
 rr   personal pronoun 
 rz   demonstrative pronoun 
 rzt  temporal demonstrative pronoun 
 rzs  locative demonstrative pronoun 
 rzv  predicative demonstrative pronoun 
 ry   interrogative pronoun 
 ryt  temporal interrogative pronoun 
 rys  locative interrogative pronoun 
 ryv  predicative interrogative pronoun 
 rg   pronominal morpheme (代词性语素) 

   10.  Numerals 数词 (1 first-level, 1 second-level) 

 m    numeral 
 mq   numeral-classifier compound (数量词) 

   11.  Classifiers 量词 (1 first-level, 2 second-level) 

 q    classifier 
 qv   verbal classifier (动量词) 
 qt   temporal classifier (时量词) 

   12.  Adverbs 副词 (1 first-level) 

 d    adverb 

   13.  Prepositions 介词 (1 first-level, 2 second-level) 

 p    preposition 
 pba  the preposition 把 
 pbei the preposition 被 

   14.  Conjunctions 连词 (1 first-level, 1 second-level) 

 c    conjunction 
 cc   coordinating conjunction 

   15.  Particles 助词 (1 first-level, 15 second-level) 

 u     particle 
 uzhe  着 
 ule   了 喽 
 uguo  过 
 ude1  的 底 
 ude2  地 
 ude3  得 
 usuo  所 
 udeng 等 等等 云云 
 uyy   一样 一般 似的 般 
 udh   的话 
 uls   来讲 来说 而言 说来 
 uzhi  之 
 ulian 连 (as in "连小学生都会") 

   16.  Interjections 叹词 (1 first-level) 

 e    interjection 

   17.  Modal particles 语气词 (1 first-level) 

 y    modal particle (delete yg) 

   18.  Onomatopoeia 拟声词 (1 first-level) 

 o    onomatopoeic word 

   19.  Prefixes 前缀 (1 first-level) 

 h    prefix 

   20.  Suffixes 后缀 (1 first-level) 

 k    suffix 

   21.  Strings 字符串 (1 first-level, 2 second-level) 

 x    string 
 xe   email address string 
 xs   Weibo conversation separator 
 xm   emoticon (表情符号) 
 xu   URL 

   22.  Punctuation 标点符号 (1 first-level, 16 second-level) 

 w    punctuation 
 wkz  left bracket, full-width: ( 〔 [ { 《 【 〖 〈  half-width: ( [ { < 
 wky  right bracket, full-width: ) 〕 ] } 》 】 〗 〉  half-width: ) ] } > 
 wyz  left quotation mark, full-width: “ ‘ 『 
 wyy  right quotation mark, full-width: ” ’ 』 
 wj   period, full-width: 。 
 ww   question mark, full-width: ?  half-width: ? 
 wt   exclamation mark, full-width: !  half-width: ! 
 wd   comma, full-width: ,  half-width: , 
 wf   semicolon, full-width: ;  half-width: ; 
 wn   enumeration comma (顿号), full-width: 、 
 wm   colon, full-width: :  half-width: : 
 ws   ellipsis, full-width: …… … 
 wp   dash, full-width: —— -- ——-  half-width: --- ---- 
 wb   percent/per-mille sign, full-width: % ‰  half-width: % 
 wh   unit symbol, full-width: ¥ $ £ ° ℃  half-width: $   
        
See More >>


Topic: Penn Chinese Treebank Project

              The Penn Chinese Treebank Project 

 Growing interest in Chinese Language Processing is leading to the development of resources such as annotated corpora and automatic segmenters, part-of-speech taggers and parsers. Currently these are all being developed independently, often with quite different standards for segmentation, part-of-speech tagging and syntactic bracketing. The time is ripe for an open discussion of the methodological issues involved in achieving agreement on annotation standards. 

Unlike Western and Middle Eastern Writing systems, Chinese writing does not have a natural delimiter between words with the result that appropriate word segmentation becomes a prerequisite for any other NLP tasks. In the literature this problem has been discussed extensively. The problem of part-of-speech tagging is closely related. These are both prerequisites to the establishment of a Chinese Treebank that could be of general use. 

We have completed building a 500-thousand-word Chinese Treebank. Our aim is to work towards a community consensus on guidelines that will include the input of influential researchers from Taiwan, Singapore, Hong Kong, China and the US. To this end, we held two workshops and a number of meetings between 7/1998 to 10/2000 in USA and abroad. We are very interested in the community's reaction to our guidelines and Treebank, and encourage anyone interested in getting involved to please look into the guidelines we have attached below, use the Treebank, which is available via LDC, and to get in touch with us with your comments.  Descriptions of the project:  
 Task: Building a segmented, POS tagged and bracketed Chinese corpus. The data consists of Xinhua newswire, Hong Kong news and articles from Sinorama news magazine. 
 Project Status: The Chinese TreeBank (CTB) version 4.0, which has 404K words, has been officially released via the Linguistic Data Consortium. CTB 5.0, which will have 507K words, is also in the LDC data release pipeline. It will be available at the end of 2004. 

 Penn guidelines for Chinese Treebank 
 Segmentation guidelines (final version): [ps-file], [pdf-file] 
 Guideline for POS tagging (final version): [ps-file], [pdf-file] 
 Guideline for Bracketing (final version): [ps-file], [pdf-file]

 All three guidelines are now IRCS technical reports. The ID numbers are 00-06, 00-07 and 00-08, respectively. 

 Publications 

 2000: Developing Guidelines and Ensuring Consistency for Chinese Text Annotation. Fei Xia, Martha Palmer, Nianwen Xue, Mary Ellen Okurowski, John Kovarik, Fu-Dong Chiou, Shizhe Huang, Tony Kroch, and Mitch Marcus. 
  Proceedings of the second International Conference on Language Resources and Evaluation (LREC 2000), Athens, Greece, 2000. 

 2001: Facilitating Treebank Annotation with a Statistical Parser. Fu-Dong Chiou, David Chiang, and Martha Palmer. 
  Proceedings of the Human Language Technology Conference (HLT 2001), San Diego, California, 2001. 

 2002: Building a Large-Scale Annotated Chinese Corpus. Nianwen Xue, Fu-Dong Chiou, and Martha Palmer. 
  Proceedings of the 19th International Conference on Computational Linguistics (COLING 2002), Taipei, Taiwan, 2002. 

 2005: The Penn Chinese TreeBank: Phrase Structure Annotation of a Large Corpus. Nianwen Xue, Fei Xia, Fu-Dong Chiou, and Martha Palmer. 
  Natural Language Engineering, 11(2):207-238. 

 Personnel 

 Principal Investigators: Martha Palmer, Mitch Marcus, Tony Kroch 
 Consultants: Shizhe Huang, Mary Ellen Okurowski, John Kovarik, Boyan A. Onyshkevyc 
 Project Managers: Shudong Huang (September - December, 1998), Fei Xia (September 1998 - December 2000), Nianwen Xue (May 1999 - May 2000), Fu-Dong Chiou (January 2001 - present) 
 Guideline Designers: Fei Xia, Nianwen Xue 
 Programming Support: Zhibiao Wu (September 1998 - September 2000), Scott Cotton (October - December, 2000) 
 Annotators: Meiyu Chang (June 2003 - present), Fu-Dong Chiou (September 1998 - present), Shudong Huang (September - December, 1998), Tsan-Kuang Lee (June 2002 - present), Nianwen Xue (September 1998 - May 2000; September 2001 - November 2002) 

 Sample Files 
 file 1: [ps-file], [pdf-file] 
 file 2: [ps-file], [pdf-file] 

 Treebank Releases 

 Preliminary Release: June 2000, see the announcement

 Second Release: Dec 2000, see  the announcement

 Workshops and meetings 
  1st CLP Workshop (6-7/98), Philadelphia, USA  
 meeting during ACL-98, Montreal, Canada (8/98) 
 meeting during ICCIP-98, Beijing, China (11/98) 
 meeting during ACL-99, Maryland, USA (6/99) 
  2nd CLP Workshop (10/00), Hong Kong, China 

 Links to other sites 
  Penn English Treebank Project 
  Penn Korean Treebank Project

  Acknowledgment 

  Last modified on February 10, 2004.   
        
See More >>


Topic: Penn Treebank Constituent Tags

               Penn Treebank II Constituent Tags  

 Note: This information comes from "Bracketing Guidelines for Treebank II Style Penn Treebank Project" - part of the documentation that comes with the Penn Treebank. 

  Contents:   
Bracket Labels  
Clause Level 
Phrase Level 
Word Level  
Function Tags  
Form/function discrepancies 
Grammatical role 
Adverbials 
Miscellaneous  
Index of All Tags  

   Bracket Labels 

    Clause Level   

  S - simple declarative clause, i.e. one that is not introduced by a (possibly empty) subordinating conjunction or a wh-word and that does not exhibit subject-verb inversion.
   SBAR - Clause introduced by a (possibly empty) subordinating conjunction.
   SBARQ - Direct question introduced by a wh-word or a wh-phrase. Indirect questions and relative clauses should be bracketed as SBAR, not SBARQ.
   SINV - Inverted declarative sentence, i.e. one in which the subject follows the tensed verb or modal.
   SQ - Inverted yes/no question, or main clause of a wh-question, following the wh-phrase in SBARQ.
    Phrase Level 

   ADJP - Adjective Phrase.
   ADVP - Adverb Phrase.
   CONJP - Conjunction Phrase.
   FRAG - Fragment.
   INTJ - Interjection. Corresponds approximately to the part-of-speech tag UH.
   LST - List marker. Includes surrounding punctuation.
   NAC - Not a Constituent; used to show the scope of certain prenominal modifiers within an NP.
   NP - Noun Phrase. 
   NX - Used within certain complex NPs to mark the head of the NP. Corresponds very roughly to N-bar level but used quite differently.
   PP - Prepositional Phrase.
   PRN - Parenthetical. 
   PRT - Particle. Category for words that should be tagged RP. 
   QP - Quantifier Phrase (i.e. complex measure/amount phrase); used within NP.
   RRC - Reduced Relative Clause. 
   UCP - Unlike Coordinated Phrase. 
   VP - Verb Phrase. 
   WHADJP - Wh-adjective Phrase. Adjectival phrase containing a wh-adverb, as in how hot.
   WHADVP - Wh-adverb Phrase. Introduces a clause with an ADVP gap. May be null (containing the 0 complementizer) or lexical, containing a wh-adverb such as how or why.
   WHNP - Wh-noun Phrase. Introduces a clause with an NP gap. May be null (containing the 0 complementizer) or lexical, containing some wh-word, e.g. who, which book, whose daughter, none of which, or how many leopards.
   WHPP - Wh-prepositional Phrase. Prepositional phrase containing a wh-noun phrase (such as of which or by whose authority) that either introduces a PP gap or is contained by a WHNP.
   X - Unknown, uncertain, or unbracketable. X is often used for bracketing typos and in bracketing the...the-constructions.
    Word level 

   CC - Coordinating conjunction
   CD - Cardinal number
   DT - Determiner
   EX - Existential there
   FW - Foreign word
   IN - Preposition or subordinating conjunction
   JJ - Adjective
   JJR - Adjective, comparative
   JJS - Adjective, superlative
   LS - List item marker
   MD - Modal
   NN - Noun, singular or mass
   NNS - Noun, plural
   NNP - Proper noun, singular
   NNPS - Proper noun, plural
   PDT - Predeterminer
   POS - Possessive ending
   PRP - Personal pronoun
   PRP$ - Possessive pronoun (prolog version PRP-S)
   RB - Adverb
   RBR - Adverb, comparative
   RBS - Adverb, superlative
   RP - Particle
   SYM - Symbol
   TO - to
   UH - Interjection
   VB - Verb, base form
   VBD - Verb, past tense
   VBG - Verb, gerund or present participle
   VBN - Verb, past participle
   VBP - Verb, non-3rd person singular present
   VBZ - Verb, 3rd person singular present
   WDT - Wh-determiner
   WP - Wh-pronoun
   WP$ - Possessive wh-pronoun (prolog version WP-S)
   WRB - Wh-adverb
    Function tags 

    Form/function discrepancies 

   -ADV (adverbial) - marks a constituent other than ADVP or PP when it is used adverbially (e.g. NPs or free ("headless") relatives). However, constituents that themselves are modifying an ADVP generally do not get -ADV. If a more specific tag is available (for example, -TMP) then it is used alone and -ADV is implied. See the Adverbials section.
   -NOM (nominal) - marks free ("headless") relatives and gerunds when they act nominally.
    Grammatical role 

   -DTV (dative) - marks the dative object in the unshifted form of the double object construction. If the preposition introducing the "dative" object is for, it is considered benefactive (-BNF). -DTV (and -BNF) is only used after verbs that can undergo dative shift.
   -LGS (logical subject) - is used to mark the logical subject in passives. It attaches to the NP object of by and not to the PP node itself.
   -PRD (predicate) - marks any predicate that is not VP. In the do so construction, the so is annotated as a predicate.
  -PUT - marks the locative complement of put. 
   -SBJ (surface subject) - marks the structural surface subject of both matrix and embedded clauses, including those with null subjects.
   -TPC ("topicalized") - marks elements that appear before the subject in a declarative sentence, but in two cases only:  
 (i) if the fronted element is associated with a *T* in the position of the gap; 
 (ii) if the fronted element is left-dislocated (i.e. it is associated with a resumptive pronoun in the position of the gap). 
   -VOC (vocative) - marks nouns of address, regardless of their position in the sentence. It is not coindexed with the subject and does not get -TPC when it is sentence-initial.
    Adverbials   

 Adverbials are generally VP adjuncts.

  -BNF (benefactive) - marks the beneficiary of an action (attaches to NP or PP). 
 This tag is used only when (1) the verb can undergo dative shift and (2) the prepositional variant (with the same meaning) uses for. The prepositional objects of dative-shifting verbs with prepositions other than for (such as to or of) are annotated -DTV.
   -DIR (direction) - marks adverbials that answer the questions "from where?" and "to where?" It implies motion, which can be metaphorical, as in "...rose 5 pts. to 57-1/2" or "increased 70% to 5.8 billion yen". -DIR is most often used with verbs of motion/transit and financial verbs.
   -EXT (extent) - marks adverbial phrases that describe the spatial extent of an activity. -EXT was incorporated primarily for cases of movement in financial space, but is also used in analogous situations elsewhere. Obligatory complements do not receive -EXT. Words such as fully and completely are absolutes and do not receive -EXT. 
   -LOC (locative) - marks adverbials that indicate place/setting of the event. -LOC may also indicate metaphorical location. There is likely to be some variation in the use of -LOC due to differing annotator interpretations. In cases where the annotator is faced with a choice between -LOC or -TMP, the default is -LOC. In cases involving SBAR, SBAR should not receive -LOC. -LOC has some uses that are not adverbial, such as with place names that are adjoined to other NPs and NAC-LOC premodifiers of NPs. The special tag -PUT is used for the locative argument of put.
   -MNR (manner) - marks adverbials that indicate manner, including instrument phrases.
   -PRP (purpose or reason) - marks purpose or reason clauses and PPs.
   -TMP (temporal) - marks temporal or aspectual adverbials that answer the questions when, how often, or how long. It has some uses that are not strictly adverbial, such as with dates that modify other NPs at S- or VP-level. In cases of apposition involving SBAR, the SBAR should not be labeled -TMP. Only in "financialspeak," and only when the dominating PP is a PP-DIR, may temporal modifiers be put at PP object level. Note that -TMP is not used in possessive phrases. 
    Miscellaneous 

   -CLR (closely related) - marks constituents that occupy some middle ground between arguments and adjuncts of the verb phrase. These roughly correspond to "predication adjuncts", prepositional ditransitives, and some "phrasal verbs". Although constituents marked with -CLR are not strictly speaking complements, they are treated as complements whenever it makes a bracketing difference. The precise meaning of -CLR depends somewhat on the category of the phrase.  
on S or SBAR - These categories are usually arguments, so the -CLR tag indicates that the clause is more adverbial than normal clausal arguments. The most common case is the infinitival semi-complement of use, but there are a variety of other cases. 
on PP, ADVP, SBAR-PRP, etc - On categories that are ordinarily interpreted as (adjunct) adverbials, -CLR indicates a somewhat closer relationship to the verb. For example:  
Prepositional Ditransitives
 In order to ensure consistency, the Treebank recognizes only a limited class of verbs that take more than one complement (-DTV, -PUT, and small clauses). Verbs that fall outside these classes (including most of the prepositional ditransitive verbs in class [D2]) are often associated with -CLR. 
Phrasal verbs
 Phrasal verbs are also annotated with -CLR or a combination of -PRT and PP-CLR. Words that are considered borderline between particle and adverb are often bracketed with ADVP-CLR. 
Predication Adjuncts
 Many of Quirk's predication adjuncts are annotated with -CLR.  
on NP - To the extent that -CLR is used on NPs, it indicates that the NP is part of some kind of "fixed phrase" or expression, such as take care of. Variation is more likely for NPs than for other uses of -CLR. 
   -CLF (cleft) - marks it-clefts ("true clefts") and may be added to the labels S, SINV, or SQ.
   -HLN (headline) - marks headlines and datelines. Note that headlines and datelines always constitute a unit of text that is structurally independent from the following sentence.
   -TTL (title) - is attached to the top node of a title when this title appears inside running text. -TTL implies -NOM. The internal structure of the title is bracketed as usual.
    Index of All Tags 

  ADJP  
  -ADV  
  ADVP  
  -BNF  
  CC  
  CD  
  -CLF  
  -CLR  
  CONJP  
  -DIR  
  DT  
  -DTV  
  EX  
  -EXT  
  FRAG  
  FW  
  -HLN  
  IN  
  INTJ  
  JJ  
  JJR  
  JJS  
  -LGS  
  -LOC  
  LS  
  LST  
  MD  
  -MNR  
  NAC  
  NN  
  NNS  
  NNP  
  NNPS  
  -NOM  
  NP  
  NX  
  PDT  
  POS  
  PP  
  -PRD  
  PRN  
  PRP  
  -PRP  
  PRP$ or PRP-S  
  PRT  
  -PUT  
  QP  
  RB  
  RBR  
  RBS  
  RP  
  RRC  
  S  
  SBAR  
  SBARQ  
  -SBJ  
  SINV  
  SQ  
  SYM  
  -TMP  
  TO  
  -TPC  
  -TTL  
  UCP  
  UH  
  VB  
  VBD  
  VBG  
  VBN  
  VBP  
  VBZ  
  -VOC  
  VP  
  WDT  
  WHADJP  
  WHADVP  
  WHNP  
  WHPP  
  WP  
  WP$ or WP-S  
  WRB  
  X   
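
A quick way to see these tags in practice is NLTK (a minimal sketch, assuming the nltk package and its punkt and averaged_perceptron_tagger data are installed; the sentence is made up):

import nltk
from nltk import Tree

# Word-level tags: nltk.pos_tag uses the Penn Treebank tagset listed above.
tokens = nltk.word_tokenize("The old man bought a boat.")
print(nltk.pos_tag(tokens))
# e.g. [('The', 'DT'), ('old', 'JJ'), ('man', 'NN'), ('bought', 'VBD'),
#       ('a', 'DT'), ('boat', 'NN'), ('.', '.')]

# Phrase-level labels and function tags appear in bracketed parses:
tree = Tree.fromstring(
    "(S (NP-SBJ (DT The) (JJ old) (NN man)) "
    "(VP (VBD bought) (NP (DT a) (NN boat))) (. .))")
tree.pretty_print()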
        
See More >>


Topic: Stanford CoreNLP API

                

 Using the Stanford CoreNLP API 

 The backbone of the CoreNLP package is formed by two classes: Annotation and Annotator. Annotations are the data structures that hold the results of annotators. Annotations are basically maps from keys to bits of the annotation, such as the parse, the part-of-speech tags, or named-entity tags. Annotators are a lot like functions, except that they operate over Annotations instead of Objects. They do things like tokenize, parse, or NER-tag sentences. Annotators and Annotations are integrated by AnnotationPipelines, which create sequences of generic Annotators. Stanford CoreNLP inherits from the AnnotationPipeline class and is customized with NLP Annotators. 

 The table below summarizes the Annotators currently supported and the Annotations that they generate. 


	 Property name 	 Annotator class name 	 Generated Annotation 	 Description  

	 tokenize 	 TokenizerAnnotator 	 TokensAnnotation (list of tokens), and CharacterOffsetBeginAnnotation, CharacterOffsetEndAnnotation, TextAnnotation (for each token) 	 Tokenizes the text. This component started as a PTB-style tokenizer, but was extended since then to handle noisy and web text. The tokenizer saves the character offsets of each token in the input text, as CharacterOffsetBeginAnnotation and CharacterOffsetEndAnnotation.  

	 cleanxml 	 CleanXmlAnnotator 	 XmlContextAnnotation 	 Removes XML tokens from the document.  

	 ssplit 	 WordToSentenceAnnotator 	 SentencesAnnotation 	 Splits a sequence of tokens into sentences.  

	 pos 	 POSTaggerAnnotator 	 PartOfSpeechAnnotation 	 Labels tokens with their POS tag. For more details see this page.  

	 lemma 	 MorphaAnnotator 	 LemmaAnnotation 	 Generates the word lemmas for all tokens in the corpus.  

	 ner 	 NERClassifierCombiner 	 NamedEntityTagAnnotation and NormalizedNamedEntityTagAnnotation 	 Recognizes named (PERSON, LOCATION, ORGANIZATION, MISC), numerical (MONEY, NUMBER, ORDINAL, PERCENT), and temporal (DATE, TIME, DURATION, SET) entities. Named entities are recognized using a combination of three CRF sequence taggers trained on various corpora, such as ACE and MUC. Numerical entities are recognized using a rule-based system. Numerical entities that require normalization, e.g., dates, are normalized to NormalizedNamedEntityTagAnnotation. For more details on the CRF tagger see this page.  

	 regexner 	 RegexNERAnnotator 	 NamedEntityTagAnnotation 	 Implements a simple, rule-based NER over token sequences using Java regular expressions. The goal of this Annotator is to provide a simple framework to incorporate NE labels that are not annotated in traditional NL corpora. For example, the default list of regular expressions that we distribute in the models file recognizes ideologies (IDEOLOGY), nationalities (NATIONALITY), religions (RELIGION), and titles (TITLE). Here is a simple example of how to use RegexNER. For more complex applications, you might consider TokensRegex.  

	 sentiment 	 SentimentAnnotator 	 SentimentCoreAnnotations.AnnotatedTree 	 Implements Socher et al.'s sentiment model. Attaches a binarized tree of the sentence to the sentence-level CoreMap. The nodes of the tree then contain the annotations from RNNCoreAnnotations indicating the predicted class and scores for that subtree. See the sentiment page for more information about this project.  

	 truecase 	 TrueCaseAnnotator 	 TrueCaseAnnotation and TrueCaseTextAnnotation 	 Recognizes the true case of tokens in text where this information was lost, e.g., all upper case text. This is implemented with a discriminative model implemented using a CRF sequence tagger. The true case label, e.g., INIT_UPPER is saved in TrueCaseAnnotation. The token text adjusted to match its true case is saved as TrueCaseTextAnnotation.  

	 parse 	 ParserAnnotator 	 TreeAnnotation, BasicDependenciesAnnotation, CollapsedDependenciesAnnotation, CollapsedCCProcessedDependenciesAnnotation 	 Provides full syntactic analysis, using both the constituent and the dependency representations. The constituent-based output is saved in TreeAnnotation. We generate three dependency-based outputs, as follows: basic, uncollapsed dependencies, saved in BasicDependenciesAnnotation; collapsed dependencies saved in CollapsedDependenciesAnnotation; and collapsed dependencies with processed coordinations, in CollapsedCCProcessedDependenciesAnnotation. Most users of our parser will prefer the latter representation. For more details on the parser, please see this page. For more details about the dependencies, please refer to this page.  

	 depparse 	 DependencyParseAnnotator 	 BasicDependenciesAnnotation, CollapsedDependenciesAnnotation, CollapsedCCProcessedDependenciesAnnotation 	 Provides a fast syntactic dependency parser. We generate three dependency-based outputs, as follows: basic, uncollapsed dependencies, saved in BasicDependenciesAnnotation; collapsed dependencies saved in CollapsedDependenciesAnnotation; and collapsed dependencies with processed coordinations, in CollapsedCCProcessedDependenciesAnnotation. Most users of our parser will prefer the latter representation. For details about the dependency software, see this page. For more details about dependency parsing in general, see this page.  

	 dcoref 	 DeterministicCorefAnnotator 	 CorefChainAnnotation 	 Implements both pronominal and nominal coreference resolution. The entire coreference graph (with head words of mentions as nodes) is saved in CorefChainAnnotation. For more details on the underlying coreference resolution algorithm, see this page.  

	 relation 	 RelationExtractorAnnotator 	 MachineReadingAnnotations.RelationMentionsAnnotation 	 The Stanford relation extractor is a Java implementation that finds relations between two entities. The current relation extraction model is trained on the relation types (except the kill relation) and data from the paper Roth and Yih, Global inference for entity and relation identification via a linear programming formulation, 2007, except that instead of using the gold NER tags, we used the NER tags predicted by the Stanford NER classifier to improve generalization. The default model predicts the relations Live_In, Located_In, OrgBased_In, Work_For, and None. For more details of how to use and train your own model, see this page.  

	 natlog 	 NaturalLogicAnnotator 	 OperatorAnnotation, PolarityAnnotation 	 Marks quantifier scope and token polarity, according to natural logic semantics. Places an OperatorAnnotation on tokens which are quantifiers (or other natural logic operators), and a PolarityAnnotation on all tokens in the sentence.  

	 quote 	 QuoteAnnotator 	 QuotationAnnotation 	 Deterministically picks out quotes delimited by “ or ‘ from a text. All top-level quotes are supplied by the top-level annotation for a text. If a QuotationAnnotation corresponds to a quote that contains embedded quotes, these quotes will appear as embedded QuotationAnnotations that can be accessed from the QuotationAnnotation in which they are embedded. The QuoteAnnotator can handle multi-line and cross-paragraph quotes, but any embedded quote must be delimited by a different kind of quotation mark than its parent. Does not depend on any other annotators. Support for unicode quotes is not yet present.  

	 entitymentions 	 EntityMentionsAnnotator 	 MentionsAnnotation 	 Provides a list of the mentions identified by NER (including their spans, NER tag, normalized value, and time). For instance, New York City will be identified as one mention spanning three tokens.      
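
For reference, here is a minimal Python sketch of driving a few of these annotators through the CoreNLP HTTP server (assumptions: a server started with java edu.stanford.nlp.pipeline.StanfordCoreNLPServer is listening on localhost:9000, and the requests package is installed):

import json
import requests

# Ask the server for the tokenize/ssplit/pos/ner annotators, JSON output.
props = {"annotators": "tokenize,ssplit,pos,ner", "outputFormat": "json"}
resp = requests.post(
    "http://localhost:9000/",
    params={"properties": json.dumps(props)},
    data="Stanford University is located in California.".encode("utf-8"),
)
doc = resp.json()
for sentence in doc["sentences"]:
    for tok in sentence["tokens"]:
        # Each token carries the PartOfSpeech and NamedEntityTag annotations.
        print(tok["word"], tok["pos"], tok["ner"])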
        
See More >>


Topic: The Development and Application of the Contemporary Chinese Grammatical Knowledge Base (《现代汉语语法信息词典》)

              《现代汉语语法信息词典》的开发与应用 (The Development and Application of the Grammatical Knowledge Base of Contemporary Chinese) 


 [Abstract] The Grammatical Knowledge Base of Contemporary Chinese is a machine dictionary developed so that computers can automatically analyze and generate Chinese sentences. Stored as database files, it contains more than 50,000 contemporary Chinese words and phrases; it not only gives the word class of each entry but also describes its various grammatical attributes in detail. This paper introduces the development history, contents and design philosophy of the dictionary, and illustrates with examples how it is applied in natural language processing systems. 


 Keywords: contemporary Chinese, grammatical knowledge base, machine dictionary, natural language processing 


 The Development of Contemporary Chinese Grammatical 

 Knowledge Base and its Applications 

 ZHU Xuefeng YU Shiwen WANG Hui 

 Institute of Computational Linguistics, Peking University 

 Beijing 100871, P.R.C 

 Phone: 2501892 


 Abstract 

 The Contemporary Chinese Grammatical Knowledge Base is a machine dictionary, which is developed for automatic analysis and generation of Chinese sentences. There are about 50,000 Chinese words and idioms in the knowledge base, represented by database files. The knowledge base not only gives the part of speech for each word or idiom, but also describes their various grammatical attributes. The paper introduces the design, the development and the outline of the knowledge base and shows its applications in natural language processing systems with examples. 


 Keywords: contemporary Chinese, grammatical knowledge base, machine dictionary, 

  natural language processing 


 1. The Development History of the Grammatical Knowledge Base of Contemporary Chinese 

  Ten years ago the mainstream of Chinese input technology was still character encoding, and word-based input was merely an accessory to character-based input. In 1986 the Institute of Computational Linguistics at Peking University proposed a grammar-rule-guided, sentence-oriented Chinese input scheme and implemented it within little more than a year. Reference [1] gives an accessible introduction to the principles and implementation techniques of the scheme. The method already included an electronic dictionary which, besides the entries and each word's retrieval features (pinyin, first stroke, last stroke, etc.), also recorded parts of speech and fine-grained subclasses. This dictionary became the foundation of the Grammatical Knowledge Base of Contemporary Chinese. 

  As a subtopic of the Chinese Seventh Five-Year key project "Natural Language Understanding and Human-Machine Interfaces", Yu Shiwen proposed in 1987 a plan to develop a "Grammatical Information Base of Contemporary Chinese Words" [2], focusing the research on the description of the grammatical attributes of words. At just that time the famous Chinese linguist Zhu Dexi undertook the key project "Research on Parts of Speech in Contemporary Chinese" issued by the national social science planning leading group. From then on, researchers of the Institute of Computational Linguistics and of the Department of Chinese at Peking University carried out joint research under Zhu Dexi's leadership and formed a stable cooperative relationship. In 1990 the "Grammatical Information Base of Contemporary Chinese Words" reached its first milestone and passed technical appraisal. 

  When the Eighth Five-Year key projects were being discussed, a group of Chinese natural language processing experts, represented by Professor Chen Liwei (academician of the Chinese Academy of Engineering and president of the Chinese Information Processing Society of China), keenly realized that the development of Chinese information processing, and of language information processing in particular, required a general-purpose application development platform [3][4]. This large-scale language engineering project listed the Grammatical Knowledge Base of Contemporary Chinese (hereafter sometimes abbreviated as "the grammar dictionary") as one of its subtopics, and since 1991 the Institute of Computational Linguistics has been responsible for it. Building on the results of the "Grammatical Information Base of Contemporary Chinese Words" and after another five years of effort, the project has completed the following tasks: (1) drawing up the specification and development strategy of the dictionary [5]; (2) establishing a classification system of contemporary Chinese words oriented to information processing, together with a research report on it [6]; (3) defining the scope and principles for selecting entries [7]; (4) investigating the subclassification of certain word classes [8]; (5) developing the dictionary itself, which is of course the heaviest and most arduous task. So far the dictionary contains more than 50,000 words; all of them have been classified and filled in with grammatical attribute information according to the specification, and seventy percent of them have been carefully proofread several times from different angles. 

  Following the arrangements of the general group of the application development platform project, Peking University has delivered part of the dictionary to the other subtopic groups. Recently the researchers responsible for syntactic rules reported that the grammatical knowledge the dictionary provides for syntactic analysis is valuable and quite sufficient, which is of course a great comfort and encouragement to the developers. In addition, the Institute of Computational Linguistics has developed a Chinese-English machine translation model system jointly with the Institute of Computing Technology of the Chinese Academy of Sciences, a natural language generation system for a general pictographic code jointly with Beijing Tongzi Company, and, in coordination with a Natural Science Foundation project, a multi-level annotation system for Chinese corpora [9]. These application systems use information from the grammar dictionary, and the dictionary has contributed to the milestone results they have achieved. 

  In short, the development of the Grammatical Knowledge Base of Contemporary Chinese has reached its first milestones, and the dictionary has already been used in several natural language processing applications. 


 2. An Outline of the Contents of the Grammatical Knowledge Base 

 2.1 Classification of words 

 Word classification is the foundation both of any natural language processing system and of the grammar dictionary itself. The dictionary must describe the grammatical attributes shared by every word of a class as well as the attributes peculiar to each class; only then is the grammatical information sufficient and complete without being overly redundant. The part-of-speech system of the dictionary was established, under the guidance of Zhu Dexi's grammatical theory, according to the grammatical functions of words. Contemporary Chinese words fall into the following 18 basic classes: 

 noun (n), e.g. 书、水、教授、国家、心胸、北京 

 time word (t), e.g. 明天、元旦、唐朝、现在、春天 

 place word (s), e.g. 空中、低处、郊外、隔壁 

 localizer (f), e.g. 上、下、前、后、东、西、南、北、里面、外头、中间 

 numeral (m), e.g. 一、第一、千、零、许多、分之 

 classifier (q), e.g. 个、群、公斤、杯、片、种、些 

 distinguishing word (b), e.g. 男、女、公共、微型、初级 

 pronoun (r), e.g. 你、我们、这、那么、哪儿、谁 

 verb (v), e.g. 走、休息、同意、能够、出去、是、调查 

 adjective (a), e.g. 好、红、大、温柔、美丽、突然 

 stative word (z), e.g. 雪白、金黄、泪汪汪、满满当当、灰不溜秋 

 adverb (d), e.g. 不、很、都、刚刚、难道、忽然 

 preposition (p), e.g. 把、被、对于、关于、以、按照 

 conjunction (c), e.g. 和、与、或、虽然、但是、否则 

 particle (u), e.g. 了、着、过、的、所、似的 

 modal particle (y), e.g. 吗、呢、吧、嘛、啦、呗 

 onomatopoeia (o), e.g. 呜、啪、叮呤当啷、哗啦 

 interjection (e), e.g. 唉、喔、哎哟、嗯、啊 

 The letters in parentheses are the codes of the word classes. These 18 basic classes are accepted by most linguists. Nouns, time words, place words, localizers, numerals and classifiers can be grouped together as substantives (whose main grammatical functions are subject and object); verbs, adjectives and stative words can be grouped together as predicates (whose main grammatical function is the predicate). Some pronouns belong to the substantives (e.g. 你, 我, 这儿, 哪里) and some to the predicates (e.g. 这样, 那么, 怎么样). Substantives, predicates, distinguishing words and adverbs are together called content words, while prepositions, conjunctions, particles and modal particles are together called function words. 

 Besides words of the 18 basic classes, real texts also contain units larger than the basic classes, such as: 

 idiom (i), e.g. 空中楼阁、画龙点睛、字字珠玑、一衣带水 

 fixed expression (l), e.g. 总而言之、自古以来、跑龙套、摆花架子 

 abbreviation (j), e.g. 北大、数理化、总参、三好、农牧业 

 as well as units smaller than the basic classes, such as: 

 prefix (h), e.g. 阿(~妹)、老(~张)、伪(~指令) 

 suffix (k), e.g. 子(桌~)、儿(花~)、头(石~)、式、员 

 morpheme character (g), e.g. 碧、棉、宾、洁、农、怒 

 non-morpheme character (x), e.g. 鸳、鸯、葡、萄、咖、啡 

 Chinese punctuation (w), e.g. 。，《》 、！“” 

 To meet the needs of analyzing real texts, the functional classification system of contemporary Chinese words thus comprises 26 word categories in all. 

 The classification of the 50,000-plus words in the dictionary has now been completed. 

 2.2 Structure and form of the grammar dictionary 

 The dictionary uses mature relational database technology, combining classification and attribute description to build a hierarchical base of grammatical attributes for the 50,000-plus words. Each database file captures a two-dimensional relation between words and their attributes. For a long time natural language processing has used rule systems to describe the grammatical regularities of a language. Such rule systems are highly abstract and well suited to describing the combinatorial relations between word classes, but natural language is extremely complex, every word has its own peculiarities, and rule systems can hardly cope with the complexity of large-scale real corpora. Statistical study of word co-occurrence in real corpora is a promising new direction, but the amount of data is very large and requires powerful, even massively parallel, computer systems. The grammar dictionary lies between these two approaches; it is a practical, feasible strategy that balances application requirements against objective conditions. 

 The dictionary comprises 32 database files: one general base; 24 class-specific bases (no separate bases have yet been built for interjections, onomatopoeia or non-morpheme characters); two sub-bases under the pronoun base, for personal pronouns and for demonstrative/interrogative pronouns; and six sub-bases under the verb base, for verbs taking substantive objects, verbs taking predicate objects, double-object verbs, verb-result constructions, verb-direction constructions, and separable verbs. 

 The attributes common to all words are held in the general base, which contains about 20 attributes including pronunciation, word class, segmentation mark and surname mark. The attributes peculiar to each class are filled into that class's base. Taking verbs as an example, the verb base lists 46 attributes; Table 1 shows a sample of some attributes from the verb attribute base. 

 Table 1. A sample of selected attributes from the verb attribute base 


 词语 Word | 同形 Homograph | 义项 Sense | 助动 Aux. | 外内 Intrans. | 体谓准 Object type | 双宾 Double obj. | 着了过 Aspect | 重叠 Redup. | VVO | 离合 Separable | 单作谓语 Predicate alone | 单作补语 Complement alone | 兼类 Other class
 交给 |    |      |    |    | 体   | 双 | 了     |      |     |    |    |    |   
 理发 |    |      |    | 内 |      |    | 了过   |      | VVO | 离 | 可 |    |   
 会   | A  | 见面 |    |    | 体   |    | 着了过 | VV   |     |    |    |    | n 
 会   | B1 | 理解 |    |    | 体   |    |        |      |     |    | 可 | 可 |   
 会   | B2 | 可能 | 助 |    | 谓   |    |        |      |     |    | 可 |    |   
 会   | C  | 付帐 |    |    | 体   |    |        |      |     |    | 可 |    |   
 加强 |    |      |    |    | 体准 |    | 了     |      |     |    |    |    |   
 进行 |    |      |    |    | 准   |    | 了     |      |     |    |    |    |   
 能够 |    |      | 助 |    | 谓   |    |        |      |     |    | 可 |    |   
 保管 | 1  | 保存 |    |    | 体   |    | 着了过 | ABAB |     |    | 可 |    |   
 保管 | 2  | 担保 |    |    | 谓   |    |        |      |     |    |    |    |   
 帮   |    | 帮助 |    |    | 体   | 双 | 着了过 | VV   |     |    | 可 |    | q 
 冒险 |    |      |    | 内 |      |    | 过     |      | VVO | 离 |    |    | a 
 上去 |    |      |    | 内 |      |    | 了过   |      |     | 离 | 可 | 可 |   

 (Values: 体 = takes a substantive object, 谓 = takes a predicate object, 准 = takes a quasi-predicate object, 双 = double object, 助 = auxiliary verb, 内 = intransitive, 离 = separable, 可 = yes; 着/了/过 lists the aspect particles the verb accepts; 重叠 and VVO give the reduplication patterns; 兼类 gives another word class the entry also belongs to.) 
  Certain attributes of verbs (such as the types of substantive or predicate objects they take) are characterized in further detail in the corresponding sub-bases, so the whole knowledge base forms a hierarchically structured system. 

  The general base and the class bases, the pronoun base and its two sub-bases, and the verb base and its six sub-bases can all be JOINed, with the join conditions expressed on the fields word, word class and homograph. The 32 database files thus form a "tree" with parent-child inheritance: a child node inherits all the information of its parent node; in other words, joining a parent node with a child node yields more complete information about a word, as in the sketch below. 
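
As a purely illustrative sketch of that parent-child JOIN (hypothetical table and column names; the real dictionary uses its own database files, not SQLite):

import sqlite3

# Mimic a "general base" and a "verb base" joined on the word field.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE general (word TEXT, pos TEXT, pinyin TEXT);
CREATE TABLE verb    (word TEXT, double_obj TEXT, aspect TEXT);
INSERT INTO general VALUES ('交给', 'v', 'jiaogei');
INSERT INTO verb    VALUES ('交给', '双', '了');
""")
for row in db.execute(
        "SELECT g.word, g.pinyin, v.double_obj, v.aspect "
        "FROM general g JOIN verb v ON g.word = v.word"):
    print(row)   # ('交给', 'jiaogei', '双', '了')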

 2.3 Attribute description of words 

 Although classification characterizes things concisely, clearly and with high information density, members of the same class may still each have their own peculiarities. For example, 鱼 (fish) and 牛 (ox) are both individual nouns: 鱼 has the dedicated individual classifier 尾 and 牛 has the dedicated individual classifier 头, but 鱼 can usually also combine with the measure words 斤 and 克 while 牛 cannot. The dictionary therefore relies even more on attribute description to capture the grammatical information of each individual word; for nouns, for instance, it describes in detail the various classifiers each noun can combine with. 

 The dictionary mines the grammatical attributes of each word class quite thoroughly. For verbs, the focus of the research, 46 attributes were defined, which fall roughly into seven groups. The first group concerns the verb itself, e.g. whether it is a copula, an auxiliary verb or a directional verb. The second concerns its morphological variants, e.g. whether it has the VV, ABAB, AABB, V一V or V了V forms. The third describes whether the verb has nominal properties, e.g. whether it can directly modify a noun, be directly modified by a noun, or serve as the object of the verb 有. The fourth reflects its relations with certain function words, e.g. whether it can be preceded by 不, 没 or 很 and whether it can be followed by 着, 了 or 过. The fifth describes the verb's functions in a sentence, i.e. whether it can by itself act as subject, predicate, object, adverbial or complement; whether it can act as the predicate alone is a particularly important attribute. The sixth characterizes the verb's relation to following elements: whether it can be followed by a result complement, a directional verb, a duration element or a frequency element, and whether it can take an object; if it can, the kind of object is subdivided further (substantive, predicate, double object, etc.). The seventh group contains other miscellaneous attributes, e.g. whether the verb's subject must be plural. 


 3. The Design Philosophy of the Grammatical Knowledge Base 

 3.1 Combining generality with specificity, with generality first 

 A natural language processing system usually contains a machine dictionary of lexical, syntactic and semantic information, but because such dictionaries serve a particular purpose and a particular system, porting one to another system takes so much effort that people often prefer to start from scratch. As a component of the application development platform for Chinese information processing, this grammar dictionary is independent of any particular processing system and does not even depend on a specific computational-linguistic theory or algorithm; it reflects the basic facts about the grammatical functions of contemporary Chinese words. A given application system may not need all the knowledge the dictionary contains, but it can always tailor the dictionary or extract the knowledge it needs. The principles for selecting entries and word senses and for defining grammatical attributes are all oriented to general contemporary Chinese; when the dictionary is applied to a concrete system, however, entries and attributes can be added or removed to suit that system, and the dictionary takes on a more specialized character. 


 3.2 Combining expert knowledge with corpora, with expert knowledge first 

 The establishment of the word classification system, the subclassification of certain word classes, the design of the common attributes (the general base) and the class-specific attributes (the sub-bases), and the determination of attribute values rely mainly on expert knowledge. The experts who guided, directed and participated in the development are renowned linguists of great attainment, computer scientists who accumulated rich practical knowledge while building natural language processing systems, and young computational linguists with solid grounding in both the humanities and the sciences. The grammar dictionary stores these experts' knowledge in the computer in a formalized, standardized way. Its development has also provided a suitable path for combining computer science with linguistics: computer systems can absorb linguists' knowledge fairly quickly, and linguists can fairly easily use the dictionary in research on language and on language teaching. 

 While relying on expert knowledge, we also attach importance to corpus construction. We took part in the segmentation and part-of-speech tagging of the three batches of corpus material provided by the general group. The Institute of Computational Linguistics has also built a corpus oriented to grammar research and has segmented and tagged part of it (about 700,000 characters). With these corpora the contents of the dictionary can be compared and proofread, which greatly improves their reliability. 


 3.3 Combining basic research with applied research, with basic research first 

 Throughout the Eighth Five-Year period the Institute of Computational Linguistics made the development of the grammar dictionary the focus of its work. The core members of the project group in particular devoted themselves wholeheartedly to it, put overall and long-term interests first, and persisted in doing the low-level, foundational work. 

 The institute also uses the results of the grammar dictionary in other projects, including the independently developed multi-level annotation system for contemporary Chinese corpora, CCMP [9], and the application systems developed with other organizations mentioned in Section 1. The feedback from these applications has both encouraged the project group and made it soberly aware that much hard work remains before the results can be released and put to full use. 


 4. Worked Examples of Applying the Grammar Dictionary 


 The grammar dictionary is a foundation for language information processing. It can be applied not only in the various branches of language information processing (machine translation, natural language interfaces, document retrieval, speech recognition, speech synthesis, character recognition, Chinese keyboard input, text proofreading, corpus processing, etc.) but also in traditional linguistic research, especially research on contemporary Chinese grammar. The following examples illustrate how the dictionary can be used. 


 4.1 Syntactic analysis 

 In current mainstream technology, syntactic analysis is a necessary step in the processing pipeline of machine translation and natural language understanding systems. Syntactic analysis means analyzing a sentence of a natural language according to the rules of some theory of syntax, yielding either a parse tree (as with a context-free grammar, CFG) or a functional structure represented by complex feature sets (as with Lexical-Functional Grammar, LFG). Such analysis requires knowing each word's part of speech, but parts of speech alone produce a large number of ambiguous structures. For example: 

  我们 选举 他 当 主席。 "We elected him chairman." (1) 

  我们 认为 他 是 主席。 "We think he is the chairman." (2) 

 The similarity of (1) and (2) is obvious: in terms of parts of speech they share the category sequence shown in (3). 

  r v r v n (3) 

 Under context-free grammar rules this category sequence can yield several parse trees. Looking up 选举 in the grammar dictionary shows that this verb can be followed by a pivotal construction, so the preferred structure of (1) is the left-hand tree in Figure 1. Looking up 认为 shows that this verb can only take a predicate object, and that this predicate object is a clause, so the structure of (2) can only be the right-hand tree in Figure 1. 


  (S (NP 我们/r) (VP 选举/v (NP 他/r) (VP 当/v (NP 主席/n)))) 

  (S (NP 我们/r) (VP 认为/v (SC (NP 他/r) (VP 是/v (NP 主席/n))))) 

 图1 (Figure 1). The parse trees of sentences (1) and (2), shown here in bracketed form 


  In a machine translation system, only when the correct syntactic structure of the source sentence has been obtained is it possible to generate target-language sentences of good fidelity and readability. 


 4.2 Sentence generation 

 Generally speaking, generating Chinese sentences in a natural language processing system is comparatively simple, because Chinese words have no complex inflection and word order is rather flexible; native speakers of Chinese easily guess the intended meaning from a string of words and morphemes. For this very reason, not enough effort is invested in Chinese sentence generation, and the Chinese sentences produced by natural language processing systems often have a "machine flavor" rather than reading like idiomatic Chinese. For example, it is quite common for a machine translation system to output the following two sentences: 

  她是一个美丽姑娘。 "She is a beautiful girl." (4) 

  当时敌机轰炸着这个城市。 "At that time enemy planes were bombing the city." (5) 

 美丽 is an adjective, and semantically 美丽 is a fitting modifier of 姑娘, yet (4) reads awkwardly. The reason is that only some Chinese adjectives can modify a noun directly; a good many need the particle 的 before they can modify a noun. The adjective base of the grammar dictionary records that 美丽 requires 的, while its synonym 漂亮 does not. Using just this mundane knowledge, a system can generate the more natural 她是一个美丽的姑娘 or 她是一个漂亮姑娘. Sentence (5) sounds unidiomatic because the verb 轰炸 cannot be followed by the aspect particle 着; to express the progressive it can be rewritten as 当时敌机正在轰炸这个城市. The grammar dictionary does record that 轰炸 cannot take 着 but can be modified by 正在. 


 4.3 Speech recognition and pinyin-to-character conversion 

 Speech recognition is usually divided into two stages. The first stage converts the raw speech signal into an internal pinyin sequence; this is a pattern recognition task. The second stage resolves homophonic characters and words, i.e. converts the pinyin sequence into a character sequence; this is a language information processing task. Entering Chinese from the keyboard by pinyin has to solve the same pinyin-to-character conversion problem. Suppose the given pinyin sequence is 

  Zhuo1zi5 shang4 you3 yi1 jin1 pi2pa5。 (6) 

 where the digits 1, 2, 3, 4, 5 after each syllable stand for the first, second, third, fourth and neutral tones. Since pi2pa5 corresponds to the two homophones 琵琶 (a lute-like instrument) and 枇杷 (loquat), it is not surprising that some systems convert (6) into 

  桌子上有一斤琵琶 。 (7) 

 With the grammar dictionary, however, one can look up which subclasses of classifiers, and which individual classifiers, each noun combines with. 琵琶 combines only with the individual classifier 把, whereas 枇杷 can combine with the measure word 斤. With this information the system can correct (7) and obtain 桌子上有一斤枇杷 ("there is a jin of loquats on the table"). 

 Again, suppose the system has already determined that the word for jiayi is 加以, and yanjiu is entered next. Without further information it is hard to decide whether yanjiu corresponds to 烟酒 (tobacco and liquor) or 研究 (research). The grammar dictionary, however, records that 加以 is a dummy verb that can only take a quasi-predicate object and never a substantive object; under this constraint, yanjiu can only be 研究. 


 4.4 Post-correction for Chinese character recognition 

 Current off-line Chinese character recognition typically returns several candidate characters, such as 师, 怖 and 帅, for the pattern 师. Without context it is difficult to decide among them in isolation. But in the context 三个师的士兵 ("soldiers of three divisions"), the characters before and after 师 have few strokes, are easy to recognize, and have already been uniquely determined; of the candidates, only the noun 师 is compatible with the individual classifier 个. In contemporary Chinese, 帅 and 怖 are only morphemes, cannot stand alone as words, and normally do not combine with 个. The system can therefore choose 师 from the three candidates with great confidence. 


 4.5 Corpus annotation 

 The experience of developing the multi-level Chinese corpus processing system CCMP at the Institute of Computational Linguistics shows that for corpus annotation a strategy combining rule-based and statistical methods is appropriate, and that carrying out segmentation and tagging simultaneously is reasonable [9]. The grammar dictionary can play an important role in such annotation. The tens of thousands of words in it have already been classified, which basically guarantees the correctness and consistency of the tagging, so the tagging program can concentrate on disambiguating words that belong to several classes and on identifying unknown words and determining their parts of speech. 

 Purely statistical part-of-speech tagging also requires someone to tag part of the corpus by hand first (i.e. to train the system). Because different grammatical frameworks exist, because different people understand them differently, and because even one person's understanding changes over time, manually tagged corpora inevitably contain inconsistencies. For example, a predicate (a verb, adjective, etc.) in subject or object position may be tagged either as a predicate or as a noun, which lowers the accuracy of automatic tagging. With this grammar dictionary such cases do not arise. Moreover, a POS-tagged corpus combined with the grammatical knowledge base forms a three-dimensional knowledge base: from a word and its tag in the corpus one can quickly retrieve the word's many grammatical properties, providing rich knowledge for further analysis or annotation. 


 5. Postscript 

 Although this research has produced considerable milestone results, much work remains to be done. The project group is determined to persevere and to keep pushing the research forward. From beginning to end the research has been carried out with the care and support of academician Chen Liwei. Many experts of the general group (such as Yuan Qi, Dong Zhendong and Huang Changning) and the cooperating organizations have given the Institute of Computational Linguistics support and encouragement in many forms, for which we express our heartfelt thanks. 

 Lu Jianming and Guo Rui acted as important advisers. Zhang Yunyun, Guo Tao, Zhou Qiang, Tao Xiaopeng, Zhan Weidong, Zhou Lina and others at the Institute of Computational Linguistics contributed to this research, whether in developing the dictionary itself or in applying it. 


 References 

 [1] Yu Shiwen. The application of syntactic analysis techniques in Chinese input. Journal of Chinese Information Processing (《中文信息学报》), 1988, Vol. 2, No. 3, pp. 20-26. 

 [2] Yu Shiwen. An outline of the development of the information base of contemporary Chinese words. Computer Development and Applications (《计算机开发与应用》), 1989, Vol. 5, No. 2, pp. 16-18. 

 [3] Chen Liwei. Building an application development platform is the pressing task of Chinese information processing technology. China Computerworld (《计算机世界》), January 8, 1992, No. 2, p. 5. 

 [4] Zhengdong Dong. The Chinese information processing platform engineering project and Chinese language research. Communications of COLIPS, Volume 3, Number 2, pp. 79-88, 1993. 

 [5] Yu Shiwen, Zhu Xuefeng, Guo Rui. An outline and design of the grammatical electronic dictionary of contemporary Chinese. Proceedings of the 3rd International Conference on Chinese Information Processing (ICCIP'92), pp. 186-191. 

 [6] Yu Shiwen. An introduction to the classification system of contemporary Chinese words for information processing. In: Teaching Reference Materials on Computational Linguistics (Institute of Computational Linguistics, Peking University, July 1993), pp. 35-57. Its abstract, "On the grammatical-function classification of contemporary Chinese words", appeared in China Computer News (《中国计算机报》), May 31, 1994, pp. 73-75. 

 [7] Wang Hui, Zhu Xuefeng. The entry-selection principles of the Grammatical Electronic Dictionary of Contemporary Chinese. China Computer News (《中国计算机报》), May 31, 1994, pp. 79-83. 

 [8] Zhu Xuefeng, Wang Hui, Zhang Yunyun. The subclassification of classifiers and nouns in the grammatical electronic dictionary of contemporary Chinese. China Computer News (《中国计算机报》), May 31, 1994, p. 79. 

 [9] Zhou Qiang, Yu Shiwen. CCMP: a human-machine symbiotic multi-level Chinese corpus processing system. In: Advances and Applications in Computational Linguistics (《计算语言学进展与应用》), Tsinghua University Press, November 1995, pp. 50-55. 

 This paper appeared in Communications of COLIPS (Singapore), 1995, No. 2, pp. 81-86.     
        
See More >>


Topic: plot not responding

Problem: Octave plots don't work smoothly

Solution: 

>> available_graphics_toolkits
ans =
{
  [1,1] = fltk
  [1,2] = gnuplot
  [1,3] = qt
}

>> graphics_toolkit()    % shows the toolkit currently in use

Switch it:

>> graphics_toolkit('qt') 
or 
>> graphics_toolkit('fltk') 
or
>> graphics_toolkit('gnuplot')  % for linux
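
To make the switch permanent, the graphics_toolkit('qt') call (or whichever toolkit works) can be placed in ~/.octaverc, Octave's per-user startup file.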


Topic: conda vs pip

Quoting from the Conda blog:

Having been involved in the python world for so long, we are all aware of pip, easy_install, and virtualenv, but these tools did not meet all of our specific requirements. The main problem is that they are focused around Python, neglecting non-Python library dependencies, such as HDF5, MKL, LLVM, etc., which do not have a setup.py in their source code and also do not install files into Python’s site-packages directory.
So Conda is a packaging tool and installer that aims to do more than pip does: it handles library dependencies outside of the Python packages as well as the Python packages themselves. Conda also creates a virtual environment, like virtualenv does.

As such, Conda should be compared to Buildout perhaps, another tool that lets you handle both Python and non-Python installation tasks.

Because Conda introduces a new packaging format, you cannot use pip and Conda interchangeably;  pip cannot install the Conda package format. You can use the two tools side by side but they do not interoperate either.


Topic: conda-cheatsheet

Show >>


Topic: Matlab_based_IPython_notebook

Show >>


Topic: npy

The npy file format is documented in numpy's https://github.com/numpy/numpy/blob/master/doc/neps/npy-format.rst.

For instance, the code

>>> dt=numpy.dtype([('outer','(3,)<i4'),
...                 ('outer2',[('inner','(10,)<i4'),('inner2','f8')])])
>>> a=numpy.array([((1,2,3),((10,11,12,13,14,15,16,17,18,19),3.14)),
...                ((4,5,6),((-1,-2,-3,-4,-5,-6,-7,-8,-9,-20),6.28))],dt)
>>> numpy.save('1.npy', a)

results in the file:

93 4E 55 4D 50 59                      magic ("\x93NUMPY")
01                                     major version (1)
00                                     minor version (0)

96 00                                  HEADER_LEN (0x0096 = 150)
7B 27 64 65 73 63 72 27 
3A 20 5B 28 27 6F 75 74 
65 72 27 2C 20 27 3C 69 
34 27 2C 20 28 33 2C 29 
29 2C 20 28 27 6F 75 74 
65 72 32 27 2C 20 5B 28 
27 69 6E 6E 65 72 27 2C 
20 27 3C 69 34 27 2C 20 
28 31 30 2C 29 29 2C 20 
28 27 69 6E 6E 65 72 32                Header, describing the data structure
27 2C 20 27 3C 66 38 27                "{'descr': [('outer', '<i4', (3,)),
29 5D 29 5D 2C 20 27 66                            ('outer2', [
6F 72 74 72 61 6E 5F 6F                               ('inner', '<i4', (10,)), 
72 64 65 72 27 3A 20 46                               ('inner2', '<f8')]
61 6C 73 65 2C 20 27 73                            )],
68 61 70 65 27 3A 20 28                  'fortran_order': False,
32 2C 29 2C 20 7D 20 20                  'shape': (2,), }"
20 20 20 20 20 20 20 20 
20 20 20 20 20 0A 

01 00 00 00 02 00 00 00 03 00 00 00    (1,2,3)
0A 00 00 00 0B 00 00 00 0C 00 00 00
0D 00 00 00 0E 00 00 00 0F 00 00 00
10 00 00 00 11 00 00 00 12 00 00 00
13 00 00 00                            (10,11,12,13,14,15,16,17,18,19)
1F 85 EB 51 B8 1E 09 40                3.14

04 00 00 00 05 00 00 00 06 00 00 00    (4,5,6)
FF FF FF FF FE FF FF FF FD FF FF FF
FC FF FF FF FB FF FF FF FA FF FF FF
F9 FF FF FF F8 FF FF FF F7 FF FF FF 
EC FF FF FF                            (-1,-2,-3,-4,-5,-6,-7,-8,-9,-20)
1F 85 EB 51 B8 1E 19 40                6.28
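
The header fields above can be read back programmatically. A minimal sketch (the file name '1.npy' comes from the example; only the standard library is used, and the version 1.0 layout with a little-endian uint16 HEADER_LEN is assumed, as documented):

import ast
import struct

with open("1.npy", "rb") as f:
    magic = f.read(6)                               # b"\x93NUMPY"
    major, minor = f.read(1)[0], f.read(1)[0]       # 1, 0
    (header_len,) = struct.unpack("<H", f.read(2))  # 0x0096 = 150 above
    # The header is a Python dict literal padded with spaces and '\n'.
    header = ast.literal_eval(f.read(header_len).decode("latin1"))

print(magic, major, minor, header_len)
print(header["descr"])          # [('outer', '<i4', (3,)), ('outer2', [...])]
print(header["fortran_order"])  # False
print(header["shape"])          # (2,)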


Topic: Tutorial

A 3-day tutorial on Python
https://zhuanlan.zhihu.com/p/21332075#!


Topic: Software Versions

Product versions:

Pre Alpha - Still in someone's editor. It does not run, or barely runs. In some cases, the UI is complete at this point and advertises all of the planned functionality. Not much works; it's a prototype. Software at this stage should not be published at all, except amongst developers.

Alpha - Something that doesn't crash all of the time and is not yet feature complete. At this point, it's probably going to hallway testing and a meeting will be held to differentiate usability issues from feature creep. Software at this stage should not be released for public testing.

Beta - Stable enough to let loose to a select group of users. Ideally, you know that every single user who receives a copy is capable of finding bugs and reporting them with an adequate degree of articulation. In a perfect world, hallway testing solves everything, but the world is not perfect. Power users are your friend in the beta period. Depending on the overall stability of the software, betas may be public. Unless the app 'just works' 95+ % of the time, the beta should be private.

Release Candidate (aka RC) - You will often have multiple release candidates. You put it out in beta, got great feedback, fixed a bunch of stuff, and everything seems to work. Still, bugs hide well and you don't want an official release to flop. At this point you want only bugfixes to what exists; you don't want to do any major surgery. It's expected, at this point, that any drastic changes have been completed. Any revisions at this point should be trivial, at best.

Release - It works already, send it out into the wild, then start over again.

Next - Usually somewhat stable builds of the next version of the app, often called edge. Basically, the same as beta, but perhaps not for the faint of heart. The main difference is, it's based off the previous release version.


Module Versions:

[major].[minor].[release].[build]

major: Really a marketing decision. Are you ready to call the version 1.0? Does the company consider this a major version for which customers might have to pay more, or is it an update of the current major version which may be free? Less of an R&D decision and more a product decision.

minor: Starts from 0 whenever major is incremented. +1 for every version that goes public.

release: Every time you hit a development milestone and release the product, even internally (e.g. to QA), increment this. This is especially important for communication between teams in the organization. Needless to say, never release the same 'release' twice (even internally). Reset to 0 upon minor++ or major++.

build: Can be a SVN revision, I find that works best.


Topic: Combine All Visits Info Into Patient Record

------ For System.App.Web.ROP ---------

----- create intermediate view ------

IF OBJECT_ID('view_all_info_in_one_line') IS NOT NULL  
  DROP VIEW view_all_info_in_one_line
GO

create view view_all_info_in_one_line as 
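
-- One row per visit: concat() flattens the visit timestamp and the clinical
-- fields into a single string column (visit_info), keyed by VisitId/Newborn_Id.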

SELECT VisitId, Newborn_Id, concat(
		convert(varchar, VisitTimeStamp, 126), ' ', 
		[Name], ' ',
		[VisitDiagnosis], ' '
      ,[VisitTreatment], ' '
      ,[SpecialtyExam], ' '
      ,[Vision]
      ,[OuterEye]
      ,[EyePosition]
      ,[DioptricMedia]
      ,[Fundus]
      ,[AuxiliaryExam]
      ,[DioptricStatus]
      ,[DioptricStatusODf1]
      ,[DioptricStatusODf2]
      ,[DioptricStatusODf3]
      ,[Bioassay]
      ,[FundusPhotograph]
      ,[OCT]
      ,[OuterInspection]
      
      ,[VisitCorrectedGestationalAgeWeek]
      ,[VisitCorrectedGestationalAgeDay]
      ,[VisitCorrectedGestationalAge]
      ,[VisitExaminationMethod]
      ,[VisitExaminationPhysician]
      ,[VisitOtherSituation]
      ,[VisitInspctionRemark]
      ,[InjectionDrug],
	   '  ',
	  (
             CASE 
                  WHEN [RegularReexamTimeInterval] is null 
                     THEN '' 
                  ELSE N'复查：' + [RegularReexamTimeInterval]  -- N'复查：' ("re-exam: ") restored from garbled text
             END) 

      ,[RegularReexamTimeInterval_1]
      ,[RegularReexamTimeInterval]
      ,[RegularReexamTimeIntervalUnit]
      ,[DioptricStatusOSf1]
      ,[DioptricStatusOSf2]
      ,[DioptricStatusOSf3]
      ,[MeanAxialOD]
      ,[MeanAxialOS]
      ,[AnteriorChamberDepthOD]
      ,[AnteriorChamberDepthOS]
      ,[CorneaCurvatureOD]
      ,[CorneaCurvatureOS]
      ,[VisitInjectionDrugOD]
      ,[VisitInjectionDrugOS]
      ,[VisitInjectionDosageOD]
      ,[VisitInjectionDosageOS]
      ,[VisitDiagnosisOD]
      ,[VisitDiagnosisODRegion]
      ,[VisitDiagnosisODStage]
      ,[VisitDiagnosisODPlus]
      ,[VisitDiagnosisODDegenerative]
      ,[VisitDiagnosisOS]
      ,[VisitDiagnosisOSRegion]
      ,[VisitDiagnosisOSStage]
      ,[VisitDiagnosisOSPlus]
      ,[VisitDiagnosisOSDegenerative]
      ,[VisitExaminationMethods]
      ,[LensThicknessOD]
      ,[LensThicknessOS]
      ,[Anterior]
      ,[Optometry]
      ,[EyePositionDetail]
      ,[CorneaCurvatureODk2]
      ,[CorneaCurvatureOSk2]) as visit_info
  FROM [Shenzhen].[dbo].[Visit]

GO

------ create intermediate view 2 -------

IF OBJECT_ID('view_all_visits_in_one_line') IS NOT NULL  
  DROP VIEW view_all_visits_in_one_line
GO

create view view_all_visits_in_one_line as
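
-- One row per newborn: the correlated subquery emits ' || ' + visit_info for
-- each of that newborn's visits, FOR XML PATH('') concatenates those rows into
-- one string, and STUFF(..., 1, 4, '') strips the leading ' || ' separator.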

select Newborn_Id as Newborn_Id,
  stuff((SELECT ' || ' + visit_info
           FROM view_all_info_in_one_line v2
           where v2.Newborn_Id = v1.Newborn_Id
           FOR XML PATH('')),1,4,'') as visits_info
from view_all_info_in_one_line v1
group by Newborn_Id

GO

-------  result ------
IF OBJECT_ID('view_result') IS NOT NULL  
  DROP VIEW view_result
GO

create view view_result as

SELECT [Id]
      ,[Screening_ID]
      ,[Record_ID]
      ,[CopyImage]
      ,[MotherName]
      ,[BabyName]
      ,[Gender]
      ,[Birthday]
      ,[GestationalAgeInWeeks]
      ,[PregnantDays]
      ,[BirthWeight]
      ,[NumberOfFetus]
      ,[ChildbirthMethod]
      ,[ConceptionMethod]
      ,[BirthHospital]
      ,[NeonatologyAdmissionHospital]
      ,[Hometown]
      ,[Phone]
      ,[RegistrationTime]
      ,[CorrectedGestationalAge]
      ,[OcularInspection]
      ,[Diagnosis]
      ,[DiagnosisTime]
      ,[ExaminationPhysician]
      ,[OtherSituation]
      ,[InspctionRemark]
      ,[FatherAge]
      ,[MotherAge]
      ,[ReproductiveHistory]
      ,[OxygenHistory]
      ,[SystemicDiseaseHistory]
      ,[FamilyDiseaseHistory]
      ,[ParentsHistory]
      ,[Procedure]
      ,[Result]
      ,[Treatment]
      ,[FollowUpDuration]
      ,[SurgeryOperator]
      ,[PhotocoagulationParameter]
      ,[CondensationParameter]
      ,[ExaminationMethod]
      ,[OtherJointSurgery]
      ,[SurgeryRemark]
      ,[ThenTime]
      ,[FollowUpTime]
      ,[PostoperativeConditions]
      ,[OxygenMethod]
      ,[OxygenDuration]
      ,[OxygenDurationUnit]
      ,[OxygenConcentration]
      ,[Ventilator]
      ,[CPAP]
      ,[SystematicDisease]
      ,[SystematicDiseaseTreatment]
      ,[PregnancyStatus]
      ,[PregnancyMedication]
      ,[Apgar1min]
      ,[Apgar5min]
      ,[Apgar10min]
      ,[Homozygotic]
      ,[InitialDiagnosisTime]
      ,[MostSevereDiagnosisOD]
      ,[MostSevereDiagnosisODRegion]
      ,[MostSevereDiagnosisODStage]
      ,[MostSevereDiagnosisODPlus]
      ,[MostSevereDiagnosisODNOROP]
      ,[MostSevereDiagnosisOSNOROP]
      ,[MostSevereDiagnosisODAPROP]
      ,[MostSevereDiagnosisODDegenerative]
      ,[MostSevereDiagnosisODOther]
      ,[MostSevereDiagnosisOS]
      ,[MostSevereDiagnosisOSRegion]
      ,[MostSevereDiagnosisOSStage]
      ,[MostSevereDiagnosisOSPlus]
      ,[MostSevereDiagnosisOSAPROP]
      ,[MostSevereDiagnosisOSDegenerative]
      ,[MostSevereDiagnosisOSOther]
      ,[TimeStamp]
      ,[GeneticStudy]
      ,[GeneticStudyBloodDraw]
      ,[GeneticStudyResult]
      ,[Reserved1]
      ,[Tag01]
      ,[Tag02]
      ,[Tag03]
      ,[Tag04]
      ,[Tag05]
      ,[Tag06]
      ,[Tag07]
      ,[Tag08]
      ,[Tag09]
      ,[Tag10]
      ,[Tag11]
      ,[Tag12]
      ,[Tag13]
      ,[Tag14]
      ,[Tag15]
      ,[Tag16], v.visits_info
  FROM [Newborn] c LEFT OUTER JOIN
  view_all_visits_in_one_line v on c.Id = v.Newborn_Id

GO

select * from view_result

GO


Topic: Deidentification

SQL to deidentify patient/doctor names (for Chinese names; note that LEN() counts characters, not bytes, so a three-character Chinese name has LEN(Name) = 3). Two alternative patterns, masking the whole name or keeping the surname:
> UPDATE table SET Name = '***' WHERE LEN(Name) = 3 
> UPDATE table SET Name = LEFT(Name,1) + '**' WHERE LEN(Name) = 3 
e.g.
UPDATE Newborn SET MotherName = '****' WHERE LEN(MotherName) = 4
UPDATE Newborn SET MotherName = '***' WHERE LEN(MotherName) = 3 
UPDATE Newborn SET MotherName = '**' WHERE LEN(MotherName) = 2 
UPDATE Newborn SET BabyName = '****' WHERE LEN(BabyName) = 4
UPDATE Newborn SET BabyName = '***' WHERE LEN(BabyName) = 3 
UPDATE Newborn SET BabyName = '**' WHERE LEN(BabyName) = 2

UPDATE Newborn SET BabyName = '****BB' WHERE LEN(MotherName) = 4 and BabyName like '%BB'
UPDATE Newborn SET BabyName = '***BB' WHERE LEN(MotherName) = 3 and BabyName like '%BB'
UPDATE Newborn SET BabyName = '**BB' WHERE LEN(MotherName) = 2 and BabyName like '%BB'

UPDATE Visit SET VisitExaminationPhysician = '****' WHERE LEN(VisitExaminationPhysician) = 4
UPDATE Visit SET VisitExaminationPhysician = '***' WHERE LEN(VisitExaminationPhysician) = 3 
UPDATE Visit SET VisitExaminationPhysician = '**' WHERE LEN(VisitExaminationPhysician) = 2 


SQL to mask a mobile phone number (China mainland numbers have 11 digits; keep the first 7 and mask the last 4):
> UPDATE table SET Phone = LEFT(Phone,7)+'****' WHERE LEN(Phone) = 11





Topic: FOR XML PATH

SQL Server's FOR XML PATH clause can render query results as XML data. Below are some usage examples.



DECLARE @TempTable table(UserID int , UserName nvarchar(50));

insert into @TempTable (UserID,UserName) values (1,'a')

insert into @TempTable (UserID,UserName) values (2,'b')

  

select UserID,UserName from @TempTable FOR XML PATH 

Running the script above produces the following result:


<row>

  <UserID>1</UserID>

  <UserName>a</UserName>

</row>

<row>

  <UserID>2</UserID>

  <UserName>b</UserName>

</row> 


Now change the argument of PATH: 


select UserID,UserName from @TempTable FOR XML PATH('x') 


Running the script again produces the following result:


<x>

  <UserID>1</UserID>

  <UserName>a</UserName>

</x>

<x>

  <UserID>2</UserID>

  <UserName>b</UserName>

</x> 



The row element has become <x>: the argument of PATH() controls the element name. So what happens if the argument is an empty string? 



select UserID,UserName from @TempTable FOR XML PATH('') 


Executing this script produces:



<UserID>1</UserID>

<UserName>a</UserName>

<UserID>2</UserID>

<UserName>b</UserName> 


Now no wrapping elements are shown at all. In PATH mode, column names (and aliases) are treated as XPath expressions, that is, as element names. So what happens if the columns are given no names at all? Try:


select CAST(UserID AS varchar) + '',UserName + '' from @TempTable FOR XML PATH('') 


The statement above produces:


1a2b

All the data is on one line with no separating string, which may not be very useful, so vary it a little:


select CAST(UserID AS varchar) + ',',UserName + '',';' from @TempTable FOR XML PATH('') 


 

Result:


1,a;2,b;

By adjusting the literals you can build whatever output you need, for example:

select '{' + CAST(UserID AS varchar) + ',','"' +UserName + '"','}' from @TempTable FOR XML PATH('') 
 

Result:


{1,"a"}{2,"b"}

Finally, an aggregation example; it should suggest many more applications:

 
DECLARE @T1 table(UserID int , UserName nvarchar(50),CityName nvarchar(50));

insert into @T1 (UserID,UserName,CityName) values (1,'a','上海')

insert into @T1 (UserID,UserName,CityName) values (2,'b','北京')

insert into @T1 (UserID,UserName,CityName) values (3,'c','上海')

insert into @T1 (UserID,UserName,CityName) values (4,'d','北京')

insert into @T1 (UserID,UserName,CityName) values (5,'e','上海')

  

SELECT B.CityName,LEFT(UserList,LEN(UserList)-1) FROM (

SELECT CityName,

    (SELECT UserName+',' FROM @T1 WHERE CityName=A.CityName  FOR XML PATH('')) AS UserList

FROM @T1 A 

GROUP BY CityName

) B 


 

Result (the list of users in each city):


北京 b,d
上海 a,c,e


Topic: GROUP_CONCAT

GROUP_CONCAT is MySQL's built-in string-aggregation function (the MySQL counterpart of the SQL Server FOR XML PATH trick above); the default separator is ','.

SELECT id,GROUP_CONCAT(title) AS title FROM doc GROUP BY id 


GROUP_CONCAT([DISTINCT] expr [,expr ...]
             [ORDER BY {unsigned_integer | col_name | expr}
                 [ASC | DESC] [,col_name ...]]
             [SEPARATOR str_val])

 
e.g.

 > SELECT student_name,
    ->     GROUP_CONCAT(test_score)
    ->     FROM student
    ->     GROUP BY student_name;

Or: 

 > SELECT student_name,
    ->     GROUP_CONCAT(DISTINCT test_score
    ->               ORDER BY test_score DESC SEPARATOR ' ')
    ->     FROM student
    ->     GROUP BY student_name;


Topic: Linux sed command

Show >>


Topic: ubuntu_filesystem_hierarchy

man hier

NAME

       hier - description of the filesystem hierarchy

DESCRIPTION

       A typical Linux system has, among others, the following directories:

       /      This  is  the  root  directory.   This  is  where the whole tree
              starts.

       /bin   This directory contains executable programs which are needed  in
              single user mode and to bring the system up or repair it.

       /boot  Contains static files for the boot loader.  This directory holds
              only the files which are needed during the  boot  process.   The
              map  installer  and  configuration  files should go to /sbin and
              /etc.  The operating system kernel (initrd for example) must  be
              located in either / or /boot.

       /dev   Special  or  device files, which refer to physical devices.  See
              mknod(1).

       /etc   Contains configuration files which are  local  to  the  machine.
              Some  larger  software  packages,  like  X11, can have their own
              subdirectories below /etc.  Site-wide configuration files may be
              placed  here  or  in  /usr/etc.   Nevertheless,  programs should
              always look for these files in /etc and you may have  links  for
              these files to /usr/etc.

       /etc/opt
              Host-specific   configuration   files  for  add-on  applications
              installed in /opt.

       /etc/sgml
              This  directory  contains  the  configuration  files  for   SGML
              (optional).

       /etc/skel
              When  a  new  user account is created, files from this directory
              are usually copied into the user's home directory.

       /etc/X11
              Configuration files for the X11 window system (optional).

       /etc/xml
              This  directory  contains  the  configuration  files   for   XML
              (optional).

       /home  On  machines  with home directories for users, these are usually
              beneath this directory, directly or not.  The structure of  this
              directory depends on local administration decisions (optional).

       /lib   This  directory  should  hold  those  shared  libraries that are
              necessary to boot the system and to run the commands in the root
              filesystem.

       /lib<qual>
              These  directories  are variants of /lib on system which support
              more  than  one  binary  format  requiring  separate   libraries
              (optional).

       /lib/modules
              Loadable kernel modules (optional).

       /lost+found
              This  directory  contains  items  lost in the filesystem.  These
              items are usually chunks of files mangled as a consequence of  a
              faulty disk or a system crash.

       /media This directory contains mount points for removable media such as
              CD and DVD disks or USB sticks.  On systems where more than  one
              device  exists  for  mounting  a  certain  type  of media, mount
              directories can be created by appending a digit to the  name  of
              those  available  above  starting  with '0', but the unqualified
              name must also exist.

       /media/floppy[1-9]
              Floppy drive (optional).

       /media/cdrom[1-9]
              CD-ROM drive (optional).

       /media/cdrecorder[1-9]
              CD writer (optional).

       /media/zip[1-9]
              Zip drive (optional).

       /media/usb[1-9]
              USB drive (optional).

       /mnt   This directory is  a  mount  point  for  a  temporarily  mounted
              filesystem.  In some distributions, /mnt contains subdirectories
              intended to be  used  as  mount  points  for  several  temporary
              filesystems.

       /opt   This  directory  should  contain  add-on  packages  that contain
              static files.

       /proc  This is a mount point for the proc  filesystem,  which  provides
              information  about  running  processes  and  the  kernel.   This
              pseudo-filesystem is described in more detail in proc(5).

       /root  This directory is usually the home directory for the  root  user
              (optional).

       /sbin  Like  /bin,  this  directory  holds  commands needed to boot the
              system, but which are usually not executed by normal users.

       /srv   This directory contains site-specific data  that  is  served  by
              this system.

       /sys   This  is  a mount point for the sysfs filesystem, which provides
              information about the kernel like /proc, but better  structured,
              following the formalism of kobject infrastructure.

       /tmp   This  directory  contains  temporary  files which may be deleted
              with no notice, such as by a regular job or at system boot up.

       /usr   This directory is usually mounted from a separate partition.  It
              should  hold  only  sharable,  read-only data, so that it can be
              mounted by various machines running Linux.

       /usr/X11R6
              The X-Window system, version 11 release 6 (optional).

       /usr/X11R6/bin
              Binaries which belong to the X-Window system; often, there is  a
              symbolic link from the more traditional /usr/bin/X11 to here.

       /usr/X11R6/lib
              Data files associated with the X-Window system.

       /usr/X11R6/lib/X11
              These contain miscellaneous files needed to run X;  Often, there
              is a symbolic link from /usr/lib/X11 to this directory.

       /usr/X11R6/include/X11
              Contains include files needed for compiling programs  using  the
              X11  window  system.   Often,  there  is  a  symbolic  link from
              /usr/include/X11 to this directory.

       /usr/bin
              This is the primary directory  for  executable  programs.   Most
              programs  executed  by  normal  users  which  are not needed for
              booting or for repairing the system and which are not  installed
              locally should be placed in this directory.

       /usr/bin/mh
              Commands for the MH mail handling system (optional).

       /usr/bin/X11
              is  the traditional place to look for X11 executables; on Linux,
              it usually is a symbolic link to /usr/X11R6/bin.

       /usr/dict
              Replaced by /usr/share/dict.

       /usr/doc
              Replaced by /usr/share/doc.

       /usr/etc
              Site-wide configuration  files  to  be  shared  between  several
              machines  may  be  stored  in this directory.  However, commands
              should always reference those files using  the  /etc  directory.
              Links  from  files in /etc should point to the appropriate files
              in /usr/etc.

       /usr/games
              Binaries for games and educational programs (optional).

       /usr/include
              Include files for the C compiler.

       /usr/include/bsd
              BSD compatibility include files (optional).

       /usr/include/X11
              Include files for the C compiler and the X-Window system.   This
              is usually a symbolic link to /usr/X11R6/include/X11.

       /usr/include/asm
              Include files which declare some assembler functions.  This used
              to be a symbolic link to /usr/src/linux/include/asm.

       /usr/include/linux
              This contains information which may change from  system  release
              to   system   release   and  used  to  be  a  symbolic  link  to
              /usr/src/linux/include/linux to get at operating-system-specific
              information.

              (Note  that  one  should  have  include  files  there  that work
              correctly with the current libc and  in  user  space.   However,
              Linux  kernel  source  is  not  designed  to  be  used with user
              programs and does not know  anything  about  the  libc  you  are
              using.   It  is  very  likely  that things will break if you let
              /usr/include/asm and /usr/include/linux point at a random kernel
              tree.  Debian systems don't do this and use headers from a known
              good kernel version, provided in the libc*-dev package.)

       /usr/include/g++
              Include files to use with the GNU C++ compiler.

       /usr/lib
              Object  libraries,  including  dynamic  libraries,   plus   some
              executables  which  usually  are  not  invoked  directly.   More
              complicated programs may have whole subdirectories there.

       /usr/lib<qual>
              These directories are  variants  of  /usr/lib  on  system  which
              support   more   than   one  binary  format  requiring  separate
              libraries, except that the symbolic link  /usr/lib<qual>/X11  is
              not required (optional).

       /usr/lib/X11
              The  usual  place for data files associated with X programs, and
              configuration files for the  X  system  itself.   On  Linux,  it
              usually is a symbolic link to /usr/X11R6/lib/X11.

       /usr/lib/gcc-lib
              contains  executables  and include files for the GNU C compiler,
              gcc(1).

       /usr/lib/groff
              Files for the GNU groff document formatting system.

       /usr/lib/uucp
              Files for uucp(1).

       /usr/local
              This is where programs which are local to the site typically go.

       /usr/local/bin
              Binaries for programs local to the site.

       /usr/local/doc
              Local documentation.

       /usr/local/etc
              Configuration files associated with locally installed programs.

       /usr/local/games
              Binaries for locally installed games.

       /usr/local/lib
              Files associated with locally installed programs.

       /usr/local/lib<qual>
              These directories are variants of /usr/local/lib on system which
              support more than one binary format requiring separate libraries
              (optional).

       /usr/local/include
              Header files for the local C compiler.

       /usr/local/info
              Info pages associated with locally installed programs.

       /usr/local/man
              Man pages associated with locally installed programs.

       /usr/local/sbin
              Locally installed programs for system administration.

       /usr/local/share
              Local application  data  that  can  be  shared  among  different
              architectures of the same OS.

       /usr/local/src
              Source code for locally installed software.

       /usr/man
              Replaced by /usr/share/man.

       /usr/sbin
              This    directory   contains   program   binaries   for   system
              administration which are not essential for the boot process, for
              mounting /usr, or for system repair.

       /usr/share
              This directory contains subdirectories with specific application
              data, that can be shared among different  architectures  of  the
              same  OS.   Often  one  finds  stuff  here  that used to live in
              /usr/doc or /usr/lib or /usr/man.

       /usr/share/dict
              Contains the word lists used by spell checkers (optional).

       /usr/share/dict/words
              List of English words (optional).

       /usr/share/doc
              Documentation about installed programs (optional).

       /usr/share/games
              Static data files for games in /usr/games (optional).

       /usr/share/info
              Info pages go here (optional).

       /usr/share/locale
              Locale information goes here (optional).

       /usr/share/man
              Manual pages go here in subdirectories according to the man page
              sections.

       /usr/share/man/<locale>/man[1-9]
              These  directories  contain manual pages for the specific locale
              in source code form.  Systems which use a  unique  language  and
              code set for all manual pages may omit the <locale> substring.

       /usr/share/misc
              Miscellaneous   data   that   can   be  shared  among  different
              architectures of the same OS.

       /usr/share/nls
              The  message  catalogs  for  native  language  support  go  here
              (optional).

       /usr/share/sgml
              Files for SGML (optional).

       /usr/share/sgml/docbook
              DocBook DTD (optional).

       /usr/share/sgml/tei
              TEI DTD (optional).

       /usr/share/sgml/html
              HTML DTD (optional).

       /usr/share/sgml/mathtml
              MathML DTD (optional).

       /usr/share/terminfo
              The database for terminfo (optional).

       /usr/share/tmac
              Troff macros that are not distributed with groff (optional).

       /usr/share/xml
              Files for XML (optional).

       /usr/share/xml/docbook
              DocBook DTD (optional).

       /usr/share/xml/xhtml
              XHTML DTD (optional).

       /usr/share/xml/mathml
              MathML DTD (optional).

       /usr/share/zoneinfo
              Files for timezone information (optional).

       /usr/src
              Source  files  for  different parts of the system, included with
              some packages for reference purposes.  Don't work here with your
              own  projects,  as  files  below /usr should be read-only except
              when installing software (optional).

       /usr/src/linux
              This was the traditional place  for  the  kernel  source.   Some
              distributions  put  here  the source for the default kernel they
              ship.  You should probably use another directory  when  building
              your own kernel.

       /usr/tmp
              Obsolete.   This  should  be  a  link to /var/tmp.  This link is
              present only for compatibility reasons and shouldn't be used.

       /var   This directory contains files which may change in size, such  as
              spool and log files.

       /var/account
              Process accounting logs (optional).

       /var/adm
              This  directory  is  superseded  by  /var/log  and  should  be a
              symbolic link to /var/log.

       /var/backups
              Reserved for historical reasons.

       /var/cache
              Data cached for programs.

       /var/cache/fonts
              Locally-generated fonts (optional).

       /var/cache/man
              Locally-formatted man pages (optional).

       /var/cache/www
              WWW proxy or cache data (optional).

       /var/cache/<package>
              Package specific cache data (optional).

       /var/catman/cat[1-9] or /var/cache/man/cat[1-9]
              These directories contain preformatted manual pages according to
              their  man  page section.  (The use of preformatted manual pages
              is deprecated.)

       /var/crash
              System crash dumps (optional).

       /var/cron
              Reserved for historical reasons.

       /var/games
              Variable game data (optional).

       /var/lib
              Variable state information for programs.

       /var/lib/hwclock
              State directory for hwclock (optional).

       /var/lib/misc
              Miscellaneous state data.

       /var/lib/xdm
              X display manager variable data (optional).

       /var/lib/<editor>
              Editor backup files and state (optional).

       /var/lib/<name>
              These directories must be used for  all  distribution  packaging
              support.

       /var/lib/<package>
              State data for packages and subsystems (optional).

       /var/lib/<pkgtool>
              Packaging support files (optional).

       /var/local
              Variable data for /usr/local.

       /var/lock
              Lock  files are placed in this directory.  The naming convention
              for device lock files is LCK..<device>  where  <device>  is  the
              device's name in the filesystem.  The format used is that of HDB
              UUCP lock files, that is, lock files contain a PID as a  10-byte
              ASCII decimal number, followed by a newline character.
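
              A minimal sketch (not from the man page; the device name is
              illustrative only) of writing such a lock file from a .NET
              process on Linux:

              using System.Diagnostics;
              using System.IO;

              class LockFileDemo
              {
                  static void Main()
                  {
                      // Lock file for a hypothetical device ttyS0, following
                      // the LCK..<device> naming convention described above.
                      string lockPath = "/var/lock/LCK..ttyS0";

                      // HDB UUCP format: the PID as a 10-byte ASCII decimal
                      // number, right-justified, followed by a newline.
                      int pid = Process.GetCurrentProcess().Id;
                      File.WriteAllText(lockPath, pid.ToString().PadLeft(10) + "\n");
                  }
              }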

       /var/log
              Miscellaneous log files.

       /var/opt
              Variable data for /opt.

       /var/mail
              Users' mailboxes.  Replaces /var/spool/mail.

       /var/msgs
              Reserved for historical reasons.

       /var/preserve
              Reserved for historical reasons.

       /var/run
              Run-time  variable files, like files holding process identifiers
              (PIDs) and  logged  user  information  (utmp).   Files  in  this
              directory are usually cleared when the system boots.

       /var/spool
              Spooled (or queued) files for various programs.

       /var/spool/at
              Spooled jobs for at(1).

       /var/spool/cron
              Spooled jobs for cron(8).

       /var/spool/lpd
              Spooled files for printing (optional).

       /var/spool/lpd/printer
              Spools for a specific printer (optional).

       /var/spool/mail
              Replaced by /var/mail.

       /var/spool/mqueue
              Queued outgoing mail (optional).

       /var/spool/news
              Spool directory for news (optional).

       /var/spool/rwho
              Spooled files for rwhod(8) (optional).

       /var/spool/smail
              Spooled files for the smail(1) mail delivery program.

       /var/spool/uucp
              Spooled files for uucp(1) (optional).

       /var/tmp
              Like  /tmp,  this  directory holds temporary files stored for an
              unspecified duration.

       /var/yp
              Database files for NIS, formerly known as the Sun  Yellow  Pages
              (YP).


Topic: AccessControl

Add the [Authorize] attribute to the corresponding Controller and its Methods.
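
For illustration, a minimal sketch (the controller and action names are hypothetical) of [Authorize] applied at both the controller and the method level in an ASP.NET MVC controller:

using System.Web.Mvc;

[Authorize]                        // every action requires a signed-in user
public class AccountController : Controller
{
    public ActionResult Profile()  // inherits the controller-level [Authorize]
    {
        return View();
    }

    [Authorize(Roles = "Admin")]   // this action additionally requires the Admin role
    public ActionResult Manage()
    {
        return View();
    }
}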


Topic: An Introduction to JavaScript Object Notation (JSON)


 An Introduction to JavaScript Object Notation (JSON) in JavaScript and .NET 

   

Atif Aziz, Scott Mitchell 

February 2007 

Applies to: JSON, Ajax

Summary: This article discusses JavaScript Object Notation (or JSON), an open, text-based data exchange format that is better suited than XML for Ajax-style web applications. (22 printed pages)

Contents

Introduction
Understanding Literal Notation in JavaScript
Comparing JSON to XML
Creating and Parsing JSON Messages with JavaScript
Working with JSON in the .NET Framework
Conclusion
References

Download the source code for this article.

Introduction

When designing an application that will communicate with a remote computer, a data format and exchange protocol must be selected. There are a variety of open, standardized options, and the ideal choice depends on the application's requirements and pre-existing functionality. For example, SOAP-based web services format the data in an XML payload wrapped within a SOAP envelope.

While XML works well for many application scenarios, it has some drawbacks that make it less than ideal for others. One such space where XML is often less than ideal is with Ajax-style web applications. Ajax is a technique used for building interactive web applications that provide a snappier user experience through the use of out-of-band, lightweight calls to the web server in lieu of full-page postbacks. These asynchronous calls are initiated on the client using JavaScript and involve formatting data, sending it to a web server, and parsing and working with the returned data. While most browsers can construct, send, and parse XML, JavaScript Object Notation (or JSON) provides a standardized data exchange format that is better-suited for Ajax-style web applications. 

JSON is an open, text-based data exchange format (see RFC 4627). Like XML, it is human-readable, platform independent, and enjoys a wide availability of implementations. Data formatted according to the JSON standard is lightweight and can be parsed by JavaScript implementations with incredible ease, making it an ideal data exchange format for Ajax web applications. Since it is primarily a data format, JSON is not limited to just Ajax web applications, and can be used in virtually any scenario where applications need to exchange or store structured information as text. 

This article examines the JSON standard, its relationship to JavaScript, and how it compares to XML. Jayrock, an open-source JSON implementation for .NET, is discussed, and examples of creating and parsing JSON messages are provided in JavaScript and C#.

Understanding Literal Notation in JavaScript

Literals are used in programming languages to literally express fixed values, such as the constant integer value of 4, or the string "Hello, World." Literals can be used in most languages wherever an expression is allowed, such as part of a condition in a control statement, an input parameter when calling a function, in variable assignment, and so forth. For example, the following C# and Visual Basic code initializes the variable x with the constant integer value of 42. 

int x = 42;            // C#

Dim x As Integer = 42  ' Visual Basic

Different programming languages allow for literals of different types. Most programming languages support, at minimum, literals for scalar types like integers, floating-point numbers, strings, and Booleans. What's interesting about JavaScript is that in addition to scalar types, it also supports literals for structured types like arrays and objects. This feature allows for a terse syntax for on-demand creation and initialization of arrays and objects.

Array literals in JavaScript are composed of zero or more expressions, with each expression representing an element of the array. The array elements are enclosed in square brackets ([]) and delimited by commas. The following example defines an array literally with seven string elements holding the names of the seven continents: 

var continents = ["Europe", "Asia", "Australia", "Antarctica",
                  "North America", "South America", "Africa"];
alert(continents[0] + " is one of the " + continents.length + " continents.");

Compare this now to how you would create and initialize an array in JavaScript without the literal notation: 

var continents = new Array();
continents[0] = "Europe";
continents[1] = "Asia";
continents[2] = "Australia";
continents[3] = "Antarctica";
continents[4] = "North America";
continents[5] = "South America";
continents[6] = "Africa";

An object literal defines the members of an object and their values. The list of object members and values is enclosed in curly braces ({}) and each member is delimited by a comma. Within each member, the name and value are delimited by a colon (:). The following example creates an object and initializes it with three members named Address, City, and PostalCode with respective values "123 Anywhere St.", "Springfield", and "99999." 

var mailingAddress = {
    "Address"    : "123 Anywhere St.",
    "City"       : "Springfield",
    "PostalCode" : 99999
};
alert("The package will be shipped to postal code " + mailingAddress.PostalCode);

The examples presented thus far illustrate using string and numeric literals within array and object literals. You can also express an entire graph by using the notation recursively such that array elements and object member values can themselves, in turn, use object and array literals. For example, the following snippet illustrates an object that has an array as a member (PhoneNumbers), where the array is composed of a list of objects. 

var contact = {
    "Name": "John Doe",
    "PermissionToCall": true,
    "PhoneNumbers": [
        { "Location": "Home", "Number": "555-555-1234" },
        { "Location": "Work", "Number": "555-555-9999 Ext. 123" }
    ]
};

if (contact.PermissionToCall)
{
    alert("Call " + contact.Name + " at " + contact.PhoneNumbers[0].Number);
}

 Note   A more thorough discussion of literal support for JavaScript can be found in the Core JavaScript 1.5 Guide under the Literals section.

From JavaScript Literals to JSON

JSON is a data exchange format that was created from a subset of the literal object notation in JavaScript. While the syntax accepted by JavaScript for literal values is very flexible, it is important to note that JSON has much stricter rules. According to the JSON standard, for example, the name of an object member must be a valid JSON string. A string in JSON must be enclosed in quotation marks. JavaScript, on the other hand, allows object member names to be delimited by quotation marks or apostrophes, or to omit quoting altogether so long as the member name doesn't conflict with a reserved JavaScript keyword. Likewise, array elements and object member values in JSON are restricted to a very limited set of values. In JavaScript, however, array elements and object member values can refer to pretty much any valid JavaScript expression, including function calls and definitions!

The charm of JSON is in its simplicity. A message formatted according to the JSON standard is composed of a single top-level object or array. The array elements and object values can be objects, arrays, strings, numbers, Boolean values (true and false), or null. That, in a nutshell, is the JSON standard! It's really that simple. See www.json.org or RFC 4627 for a more formal description of the standard. 

One of the sore points of JSON is the lack of a date/time literal. Many people are surprised and disappointed to learn this when they first encounter JSON. The simple explanation (consoling or not) for the absence of a date/time literal is that JavaScript never had one either: the support for date and time values in JavaScript is entirely provided through the Date object. Most applications using JSON as a data format, therefore, generally tend to use either a string or a number to express date and time values. If a string is used, you can generally expect it to be in the ISO 8601 format. If a number is used, instead, then the value is usually taken to mean the number of milliseconds in Universal Coordinated Time (UTC) since epoch, where epoch is defined as midnight January 1, 1970 (UTC). Again, this is a mere convention and not part of the JSON standard. If you are exchanging data with another application, you will need to check its documentation to see how it encodes date and time values within a JSON literal. For example, Microsoft's ASP.NET AJAX uses neither of the described conventions. Rather, it encodes .NET DateTime values as a JSON string, where the content of the string is \/Date(ticks)\/ and where ticks represents milliseconds since epoch (UTC). So November 29, 1989, 4:55:30 AM, in UTC is encoded as "\/Date(628318530718)\/". For some rationale behind this rather contrived choice of encoding, see "Inside ASP.NET AJAX's JSON date and time string."

Comparing JSON to XML

Both JSON and XML can be used to represent native, in-memory objects in a text-based, human-readable, data exchange format. Furthermore, the two data exchange formats are isomorphic—given text in one format, an equivalent one is conceivable in the other. For example, when calling one of Yahoo!'s publicly accessible web services, you can indicate via a querystring parameter whether the response should be formatted as XML or JSON. Therefore, when deciding upon a data exchange format, it's not a simple matter of choosing one over the other as a silver bullet, but rather what format has the characteristics that make it the best choice for a particular application. For example, XML has its roots in marking-up document text and tends to shine very well in that space (as is evident with XHTML). JSON, on the other hand, has its roots in programming language types and structures and therefore provides a more natural and readily available mapping to exchange structured data. Beyond these two starting points, the following table will help you to understand and compare the key characteristics of XML and JSON. 

Key Characteristic Differences between XML and JSON 


	Characteristic	XML	JSON

	Data types	Does not provide any notion of data types. One must rely on XML Schema for adding type information.	Provides scalar data types and the ability to express structured data through arrays and objects. 

	Support for arrays	Arrays have to be expressed by conventions, for example through the use of an outer placeholder element that models the array's contents as inner elements.  Typically, the outer element uses the plural form of the name used for inner elements.	Native array support. 

	Support for objects	Objects have to be expressed by conventions, often through a mixed use of attributes and elements.	Native object support. 

	Null support	Requires use of xsi:nil on elements in an XML instance document plus an import of the corresponding namespace.	Natively recognizes the null value. 

	Comments	Native support and usually available through APIs.	Not supported. 

	Namespaces	Supports namespaces, which eliminates the risk of name collisions when combining documents. Namespaces also allow existing XML-based standards to be safely extended.	No concept of namespaces. Naming collisions are usually avoided by nesting objects or using a prefix in an object member name (the former is preferred in practice). 

	Formatting decisions	Complex. Requires a greater effort to decide how to map application types to XML elements and attributes. Can create heated debates whether an element-centric or attribute-centric approach is better.	Simple. Provides a much more direct mapping for application data. The only exception may be the absence of a date/time literal. 

	Size	Documents tend to be lengthy in size, especially when an element-centric approach to formatting is used.	Syntax is very terse and yields formatted text where most of the space is consumed (rightly so) by the represented data. 

	Parsing in JavaScript	Requires an XML DOM implementation and additional application code to map text back into JavaScript objects.	No additional application code required to parse text; can use JavaScript's eval function. 

	Learning curve	Generally tends to require use of several technologies in concert: XPath, XML Schema, XSLT, XML Namespaces, the DOM, and so on.	Very simple technology stack that is already familiar to developers with a background in JavaScript or other dynamic programming languages.  

JSON is a relatively new data exchange format and does not have the years of adoption or vendor support that XML enjoys today (although JSON is catching up quickly). The following table highlights the current state of affairs in the XML and JSON spaces. 

Support Differences between XML and JSON 


	Support	XML	JSON

	Tools	Enjoys a mature set of tools widely available from many industry vendors.	Rich tool support—such as editors and formatters—is scarce. 

	Microsoft .NET Framework	Very good and mature support since version 1.0 of the .NET Framework. XML support is available as part of the Base Class Library (BCL). For unmanaged environments, there is MSXML.	None so far, except an initial implementation as part of ASP.NET AJAX. 

	Platform and language	Parsers and formatters are widely available on many platforms and languages (commercial and open source implementations).	Parsers and formatters are available already on many platforms and in many languages. Consult json.org for a good set of references. Most implementations for now tend to be open source projects. 

	Integrated language	Industry vendors are currently experimenting with support literally within languages. See Microsoft's LINQ project for more information.	Is natively supported in JavaScript/ECMAScript only.  

 Note   Neither table is meant to be a comprehensive list of comparison points. There are further angles on which both data formats can be compared, but we felt that these key points should be sufficient to build an initial impression.

Creating and Parsing JSON Messages with JavaScript

When using JSON as the data exchange format, two common tasks are turning a native and in-memory representation into its JSON text representation and vice versa. Unfortunately, at the time of writing, JavaScript does not provide built-in functions to create JSON text from a given object or array. These methods are expected to be included in the fourth edition of the ECMAScript standard in 2007. Until these JSON formatting functions are formally added to JavaScript and widely available across popular implementations, use the reference implementation script available for download at http://www.json.org/json.js. 

In its latest iteration at the time of this writing, the json.js script at www.json.org adds toJSONString() functions to array, string, Boolean, object, and other JavaScript types. The toJSONString() functions for scalar types (like Number and Boolean) are quite simple since they only need to return a string representation of the instance value. The toJSONString() function for the Boolean type, for example, returns the string "true" if the value is true, and "false" otherwise. The toJSONString() functions for Array and Object types are more interesting. For Array instances, the toJSONString() function for each contained element is called in sequence, with the results being concatenated with commas to delimit each result. The final output is enclosed in square brackets. Likewise, for Object instances, each member is enumerated and its toJSONString() function invoked. The member name and the JSON representation of its value are concatenated with a colon in the middle; each member name and value pair is delimited with a comma and the entire output is enclosed in curly brackets.

The net result of the toJSONString() functions is that any type can be converted into its JSON format with a single function call. The following JavaScript creates an Array object and adds seven String elements, deliberately using the verbose, non-literal method for illustrative purposes. It then goes on to display the array's JSON representation:

// json.js must be included prior to this point

var continents = new Array();
continents.push("Europe");
continents.push("Asia");
continents.push("Australia");
continents.push("Antarctica");
continents.push("North America");
continents.push("South America");
continents.push("Africa");

alert("The JSON representation of the continents array is: " +
      continents.toJSONString());


Figure 1. The toJSONString() function emits the array formatted according to the JSON standard. 

Parsing JSON text is even simpler. Since JSON is merely a subset of JavaScript literals, it can be parsed into an in-memory representation using the eval(expr) function, treating the source JSON text as JavaScript source code. The eval function accepts as input a string of valid JavaScript code and evaluates the expression. Consequently, the following single line of code is all that is needed to turn JSON text into a native representation: 


 var value = eval( "(" + jsonText + ")" );     

 Note   The extra parentheses are used to make eval unconditionally treat the source input like an expression. This is especially important for objects. If you try to call eval with a string containing JSON text that defines an object, such as the string "{}" (meaning an empty object), then it simply returns undefined as the parsed result. The parentheses force the JavaScript parser to see the top-level curly braces as the literal notation for an Object instance rather than, say, curly braces defining a statement block. Incidentally, the same problem does not occur if the top-level item is an array, as in eval("[1,2,3]"). For the sake of uniformity, however, JSON text should always be surrounded with parentheses prior to calling eval so that there is no ambiguity about how to interpret the source.

When evaluating literal notation, an instance corresponding to the literal syntax is returned and assigned to value. Consider the following example, which uses the eval function to parse the literal notation for an array and assigns the resulting array to the variable continents.

var arrayAsJSONText = '["Europe", "Asia", "Australia", "Antarctica", "North America", "South America", "Africa"]';
var continents = eval( arrayAsJSONText );
alert(continents[0] + " is one of the " + continents.length + " continents.");

Of course, in practice the evaluated JSON text will come from some external source rather than being hard-coded as in the above case. 

The eval function blindly evaluates whatever expression it is passed. An untrustworthy source could therefore include potentially dangerous JavaScript along with or mixed into the literal notation that makes up the JSON data. In scenarios where the source cannot be trusted, it is highly recommended that you parse the JSON text using the parseJSON() function (found in json.js): 

// Requires json.js
var continents = arrayAsJSONText.parseJSON();

The parseJSON() function also uses eval, but only if the string contained in arrayAsJSONText conforms to the JSON text standard. It does this using a clever regular expression test.

Working with JSON in the .NET Framework

JSON text can easily be created and parsed from JavaScript code, which is part of its allure. However, when JSON is used in an ASP.NET web application, only the browser enjoys JavaScript support since the server-side code is most likely written in Visual Basic or C#. 

Most Ajax libraries designed for ASP.NET provide support for programmatically creating and parsing JSON text. Therefore, to work with JSON in a .NET application, consider using one of these libraries. There are plenty of open-source and third-party options, and Microsoft also has its own Ajax library, named ASP.NET AJAX.

In this article we will look at examples that use Jayrock, an open-source implementation of JSON for the Microsoft .NET Framework created by coauthor Atif Aziz. We chose to use Jayrock instead of ASP.NET AJAX for three reasons:  
Jayrock is open-source, making it possible to extend or customize as needed. 
Jayrock can be used in ASP.NET 1.x, 2.0, and Mono applications, whereas ASP.NET AJAX is for ASP.NET version 2.0 only. 
Jayrock's scope is limited to JSON and JSON-RPC, and the former is the main focus of this article. While ASP.NET AJAX includes some support for creating and parsing JSON text, its primary purpose is to offer a rich platform for building end-to-end Ajax-style web applications in ASP.NET. The extra bells and whistles can be distracting when your main focus is JSON.  

Working with JSON in .NET using Jayrock is similar to working with XML through the XmlWriter, XmlReader, and XmlSerializer classes in the .NET Framework. The classes JsonWriter, JsonReader, JsonTextWriter, and JsonTextReader found in Jayrock mimic the semantics of the .NET Framework classes XmlWriter, XmlReader, XmlTextWriter, and XmlTextReader. These classes are useful for interfacing with JSON at a low, stream-oriented level. Using these classes, JSON text can be created or parsed piecemeal through a series of method calls. For example, the JsonWriter class's WriteNumber(number) method writes out the appropriate string representation of number according to the JSON standard. The JsonConvert class offers Export and Import methods for converting between .NET types and JSON. These methods provide functionality similar to that found in the XmlSerializer class methods Serialize and Deserialize, respectively.

Creating JSON Text

The following code illustrates using the JsonTextWriter class to create the JSON text for a string array of continents. This JSON text is sent to a TextWriter instance passed into the constructor, which happens to be the output stream from the console in this example (in ASP.NET you can use Response.Output instead): 

using (JsonTextWriter writer = new JsonTextWriter(Console.Out))
{
    writer.WriteStartArray();
    writer.WriteString("Europe");
    writer.WriteString("Asia");
    writer.WriteString("Australia");
    writer.WriteString("Antarctica");
    writer.WriteString("North America");
    writer.WriteString("South America");
    writer.WriteString("Africa");
    writer.WriteEndArray();
}

In addition to the WriteStartArray, WriteString, and WriteEndArray methods, the JsonWriter class provides methods for writing other JSON value types, such as WriteNumber, WriteBoolean, WriteNull, and so on. The WriteStartObject, WriteEndObject, and WriteMember methods create the JSON text for an object. The following example illustrates creating the JSON text for the contact object examined in the "Understanding Literal Notation in JavaScript" section: 

private static void WriteContact()
{
    using (JsonWriter w = new JsonTextWriter(Console.Out))
    {
        w.WriteStartObject();              // {
        w.WriteMember("Name");             //   "Name" :
        w.WriteString("John Doe");         //     "John Doe",
        w.WriteMember("PermissionToCall"); //   "PermissionToCall" :
        w.WriteBoolean(true);              //     true,
        w.WriteMember("PhoneNumbers");     //   "PhoneNumbers" :
        w.WriteStartArray();               //   [
        WritePhoneNumber(w,                //     { "Location": "Home",
            "Home",                        //       "Number":
            "555-555-1234");               //         "555-555-1234" },
        WritePhoneNumber(w,                //     { "Location": "Work",
            "Work",                        //       "Number":
            "555-555-9999");               //         "555-555-9999" }
        w.WriteEndArray();                 //   ]
        w.WriteEndObject();                // }
    }
}

private static void WritePhoneNumber(JsonWriter w, string location, string number)
{
    w.WriteStartObject();      // {
    w.WriteMember("Location"); //     "Location" :
    w.WriteString(location);   //         "...",
    w.WriteMember("Number");   //     "Number" :
    w.WriteString(number);     //         "..."
    w.WriteEndObject();        // }
}

Export and ExportToString methods in the JsonConvert class can be used to serialize a specified .NET type into JSON text. For example, rather than manually building the JSON text for the array of the seven continents using the JsonTextWriter class, the following call to JsonConvert.ExportToString produces the same results: 

string[] continents = {
    "Europe", "Asia", "Australia", "Antarctica", "North America",
    "South America", "Africa"
};
string jsonText = JsonConvert.ExportToString(continents);

Parsing JSON Text

The JsonTextReader class provides a variety of methods to parse the tokens of JSON text with the core one being Read. Each time the Read method is invoked, the parser consumes the next token, which could be a string value, a number value, an object member name, the start of an array, and so forth. Where applicable, the parsed text of the current token can be accessed via the Text property. For example, if the reader is sitting on Boolean data, then the Text property will return "true" or "false" depending on the actual parse value. 

The following sample code uses the JsonTextReader class to parse through the JSON text representation of a string array containing the names of the seven continents. Each continent that begins with the letter "A" is sent to the console: 

string jsonText = @"[""Europe"", ""Asia"", ""Australia"", ""Antarctica"", ""North America"", ""South America"", ""Africa""]";

using (JsonTextReader reader = new JsonTextReader(new StringReader(jsonText)))
{
    while (reader.Read())
    {
        if (reader.TokenClass == JsonTokenClass.String &&
            reader.Text.StartsWith("A"))
        {
            Console.WriteLine(reader.Text);
        }
    }
}

 Note   The JsonTextReader class in Jayrock is a fairly liberal JSON text parser. It actually permits a lot more syntax than is considered valid JSON text according to the rules laid out in RFC 4627. For example, the JsonTextReader class allows single-line and multi-line comments to appear within JSON text as you'd expect in JavaScript. Single-line comments start with slash-slash (//) and multi-line comments begin with slash-star (/*) and end in star-slash (*/). Single-line comments can even begin with the hash/pound sign (#), which is common among Unix-style configuration files. In all instances, the comments are completely skipped by the parser and never exposed through the API. Also as in JavaScript, JsonTextReader permits a JSON string to be delimited by an apostrophe ('). The parser can even tolerate an extra comma after the last member of an object or element of an array.

 Even with all these additions, JsonTextReader is a conforming parser! JsonTextWriter, on the other hand, produces only strict standard-conforming JSON text. This follows what is often called the robustness principle, which states, "Be conservative in what you do; be liberal in what you accept from others."

To convert JSON text directly into a .NET object, use the JsonConvert class's Import method, specifying the output type and the JSON text. The following example shows conversion of a JSON array of strings into a .NET string array:

string jsonText = @"[""Europe"", ""Asia"", ""Australia"", ""Antarctica"", ""North America"", ""South America"", ""Africa""]";
string[] continents = (string[]) JsonConvert.Import(typeof(string[]), jsonText);

Here is a more interesting example of conversion that takes an RSS XML feed, deserializes it into a .NET type using XmlSerializer, and then converts the object into JSON text using JsonConvert (effectively converting RSS in XML to JSON text): 

XmlSerializer serializer = new XmlSerializer(typeof(RichSiteSummary));
RichSiteSummary news;

// Get the MSDN RSS feed and deserialize it...
using (XmlReader reader = XmlReader.Create("http://msdn.microsoft.com/rss.xml"))
    news = (RichSiteSummary) serializer.Deserialize(reader);

// Export the RichSiteSummary object as JSON text, emitting the output to Console.Out.
using (JsonTextWriter writer = new JsonTextWriter(Console.Out))
    JsonConvert.Export(news, writer);

 Note   The definition of RichSiteSummary and its related types can be found in the samples accompanying this article.

Using JSON in ASP.NET

Having looked at ways to work with JSON in JavaScript and from within the .NET Framework using Jayrock, it's time to turn to a practical example of where and how all this knowledge can be applied. Consider the client script callback feature in ASP.NET 2.0, which simplifies the process of making out-of-band calls from the web browser to the ASP.NET page (or to a particular control on the page). During a typical callback scenario, the client-side script in the browser packages and sends data back to the web server for some processing by a server-side method. After receiving the response data from the server, the client then uses it to update the browser display. 

 Note   More information can be found in the MSDN Magazine article Script Callbacks in ASP.NET 2.0. 

The challenge in a client callback scenario is that the client and server can only ship a string back and forth. Therefore, the information to be exchanged must be converted from a native, in-memory representation to a string before being sent and then parsed from a string back to its native, in-memory representation when received. The client script callback feature in ASP.NET 2.0 does not require a particular string format for the exchanged data, nor does it provide any built-in functionality for converting between the native in-memory and string representations; it is up to the developer to implement the conversion logic based on some data exchange format of his or her choice. 

The following example illustrates how to use JSON as the data exchange format in a client script callback scenario. In particular, the example consists of an ASP.NET page that uses data from the Northwind database to provide a listing of the categories in a drop-down list; products in the selected category are displayed in a bulleted list (see Figure 3). Whenever the drop-down list is changed on the client side, a callback is made passing in an array whose single element is the selected CategoryID. 

 Note   We are passing in an array that contains the selected CategoryID as its sole element (rather than just the CategoryID) because the JSON standard requires that any JSON text must have an object or an array as its root. Of course, the client is not required to pass JSON text to the server—we could have had this example pass just the selected CategoryID as a string. However, we wanted to demonstrate sending JSON text in both the request and response messages of the callback. 

The following code in the Page_Load event handler configures the Categories DropDownList Web control so that when it is changed, the GetProductsForCategory function is called and passed the selected drop-down list's value. This function initiates the client script callback if the passed-in drop-down list value is greater than zero:

// Add client-side onchange event to drop-down list
Categories.Attributes["onchange"] = "Categories_onchange(this);";

// Generate the callback script
string callbackScript = ClientScript.GetCallbackEventReference(
    /* control        */ this,
    /* argument       */ "'[' + categoryID + ']'",
    /* clientCallback */ "showProducts",
    /* context        */ "null");

// Add the Categories_onchange function
ClientScript.RegisterClientScriptBlock(GetType(), "Categories_onchange", @"
    function Categories_onchange(sender)
    {
        clearResults();

        var categoryID = sender.value;
        if (categoryID > 0)
        {
            " + callbackScript + @"
        }
    }", true);

The GetCallbackEventReference method in the ClientScriptManager class, which is used to generate the JavaScript code that invokes the callback, has the following signature:

public string GetCallbackEventReference(
    Control control,
    string argument,
    string clientCallback,
    string context)

The argument parameter specifies what data is sent from the client to the web server during the callback, and the clientCallback parameter specifies the name of the client-side function to invoke upon completion of the callback (showProducts). The GetCallbackEventReference method call generates the following JavaScript code and adds it to the rendered markup:


 WebForm_DoCallback('__Page','[' + categoryID + ']',showProducts,null,null,false)     

'[' + categoryID + ']' is the value that is passed to the server during the callback (an array with a single element, categoryID) and showProducts is the JavaScript function that is executed when the callback returns. 

On the server side, the method that is executed in response to the callback uses the JsonConvert class from Jayrock to parse the incoming JSON text and format the outgoing JSON text. In particular, the names of the products that are associated with the selected category are retrieved and returned as a string array.

// Deserialize the JSON text into an array of integers
int[] args = (int[]) JsonConvert.Import(typeof(int[]), eventArgument);

// Read the selected CategoryID from the array
int categoryID = args[0];

// Get products based on categoryID
NorthwindDataSet.ProductsRow[] rows =
    Northwind.Categories.FindByCategoryID(categoryID).GetProductsRows();

// Load the names into a string array
string[] productNames = new string[rows.Length];
for (int i = 0; i < rows.Length; i++)
{
    productNames[i] = rows[i].ProductName;
}

// Serialize the string array as JSON text and return it to the client
return JsonConvert.ExportToString(productNames);

 Note   The JsonConvert class is used twice—once to convert the JSON text in eventArgument into an array of integers and then to convert the string array productNames into JSON text to return to the client. Alternatively, we could have used the JsonReader and JsonWriter classes here, but JsonConvert does the same job fairly well when the data involved is relatively small and easily mapped to existing types. 

When the data is returned from the server side, the JavaScript function specified in the GetCallbackEventReference method call is invoked and passed the return value. This JavaScript method, showProducts, starts by referencing the <div> element ProductOutput. It then parses the JSON response and dynamically adds an unordered list with a list item for each array element. If no products are returned for the selected category, then a corresponding message is displayed instead.

function showProducts(arg, context)
{
    // Dump the JSON text response from the server.
    document.forms[0].JSONResponse.value = arg;

    // Parse JSON text returned from callback.
    var categoryProducts = eval("(" + arg + ")");

    // Get a reference to the <div> ProductOutput.
    var output = document.getElementById("ProductOutput");

    // If no products for category, show message.
    if (categoryProducts.length == 0)
    {
        output.appendChild(document.createTextNode(
            "There are no products for this category..."));
    }
    else
    {
        // There are products, display them in an unordered list.
        var ul = document.createElement("ul");

        for (var i = 0; i < categoryProducts.length; i++)
        {
            var product = categoryProducts[i];
            var li = document.createElement("li");
            li.appendChild(document.createTextNode(product));
            ul.appendChild(li);
        }

        output.appendChild(ul);
    }
}

Figure 2 illustrates the sequence of events while Figure 3 shows this example in action; the complete code is included in this article's download.


Figure 2: The client sends the selected CategoryID as the single element in an array and the server returns an array of associated product names. 


Figure 3: The products in the selected category are displayed in a bulleted list.

Conclusion

JSON is a lightweight, text-based data exchange format based on a subset of the literal notation from the JavaScript programming language. It provides a succinct encoding for application data structures and is typically used in scenarios where a JavaScript implementation is available to one or both of the applications exchanging data, such as in Ajax-style web applications. The allure of JSON lies in its simplicity to understand, adopt, and implement. JSON has virtually no learning curve for developers already familiar with JavaScript or other programming languages with similar support for a rich literal notation (like Python and Ruby). Parsing JSON text in JavaScript code can be accomplished by simply calling the eval function, and creating JSON text is a breeze with the json.js script provided at http://www.json.org/json.js. 

There are a blossoming number of libraries for working with JSON across all major platforms and frameworks. In this article we looked at Jayrock, an open-source library for creating and parsing JSON text in .NET applications. Jayrock can be used in ASP.NET 1.x, 2.0, and Mono applications. ASP.NET AJAX offers similar JSON functionality, but for ASP.NET 2.0 applications only. 

Happy Programming!

References
ASP.NET AJAX 
Client-Side Web Service Calls with AJAX Extensions 
Core JavaScript 1.5 Guide 
eval(expr) function 
Jayrock 
JSON.org 
Script Callbacks in ASP.NET 
RFC 4627  

 Ajax or AJAX? 

 The term Ajax was initially coined by Jesse James Garrett to describe the style of web applications and set of technologies involved in making highly interactive web applications. Historically, the term Ajax spread around the web as the acronym AJAX, meaning Asynchronous JavaScript And XML. With time, however, people realized that the "X" in AJAX was not very representative of the underlying data format used to communicate with the web server in the background, since most implementations were switching to JSON as a simpler and more efficient alternative. So rather than coming up with a replacement acronym like AJAJ, which is a bit of a tongue-twister, the acronym is generally being retired in favor of Ajax the term rather than AJAX the acronym.

 At the time of this writing, expect to see a mixed and wide use of "AJAX" and "Ajax" to mean one and the same thing. In this article, we've stuck with "Ajax the term." Commercial products that provide frameworks enabling Ajax-style applications, however, tend to use the acronym form to distinguish from a similarly named cleaning agent product and to avoid any potential trademark or legal disputes. 

 ASP.NET AJAX: Inside JSON date and time string 

 The AJAX JSON serializer in ASP.NET encodes a DateTime instance as a JSON string. During its pre-release cycles, ASP.NET AJAX used the format "@ticks@", where ticks represents the number of milliseconds since January 1, 1970 in Universal Coordinated Time (UTC). A date and time in UTC like November 29, 1989, 4:55:30 AM would be written out as "@628318530718@". Although simple and straightforward, this format cannot differentiate between a serialized date and time value and a string that looks like a serialized date but is not meant to be deserialized as one. Consequently, the ASP.NET AJAX team made a change for the final release to address this problem by adopting the "\/Date(ticks)\/" format.
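
 To make the arithmetic concrete, here is a minimal sketch (not part of the original article) that computes the milliseconds-since-epoch value for a UTC DateTime and wraps it in the "\/Date(ticks)\/" form:

using System;

class JsonDateSketch
{
    // Encodes a UTC DateTime the way described above: milliseconds since
    // January 1, 1970 (UTC), wrapped as "\/Date(ticks)\/".
    static string ToJsonDate(DateTime utc)
    {
        DateTime epoch = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);
        long ms = (long)(utc - epoch).TotalMilliseconds;
        return "\"\\/Date(" + ms + ")\\/\"";
    }

    static void Main()
    {
        // November 29, 1989, 4:55:30 AM UTC (the article's example date);
        // prints "\/Date(628318530000)\/". The article's 628318530718
        // additionally carries a fractional 718 ms in the source value.
        Console.WriteLine(ToJsonDate(new DateTime(1989, 11, 29, 4, 55, 30, DateTimeKind.Utc)));
    }
}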

 The new format relies on a small trick to reduce the chance for misinterpretation. In JSON, a forward slash (/) character in a string can be escaped with a backslash (\) even though it is not strictly required. Taking advantage of this, the ASP.NET AJAX team modified JavaScriptSerializer to write a DateTime instance as the string "\/Date(ticks)\/" instead. The escaping of the two forward slashes is superficial, but significant to JavaScriptSerializer. By JSON rules, "\/Date(ticks)\/" is technically equivalent to "/Date(ticks)/", but JavaScriptSerializer will deserialize the former as a DateTime and the latter as a String. The chances for ambiguity are therefore considerably less when compared to the simpler "@ticks@" format from the pre-releases.

Special Thanks

Before submitting this article to MSDN, we had a number of volunteers help proofread the article and provide feedback on the content, grammar, and direction. Primary contributors to the review process include Douglas Crockford, Eric Schönholzer, and Milan Negovan.

About the Authors

Atif Aziz is a Principal Consultant at Skybow AG where his primary focus is to help customers understand and build solutions on the .NET development platform. Atif contributes regularly to the Microsoft developer community by speaking at conferences and writing articles for technical publications. He is an INETA speaker and president of the largest Swiss .NET User Group. He can be reached at atif.aziz@skybow.com or via his web site at http://www.raboof.com. 

Scott Mitchell, author of six ASP/ASP.NET books and founder of 4GuysFromRolla.com, has been working with Microsoft web technologies since 1998. Scott works as an independent consultant, trainer, and writer. He can be reached at mitchell@4guysfromrolla.com or via his blog: http://ScottOnWriting.net.   

See More >>


Topic: App_Data

App_Data is meant for data files such as SQL Server Express database files. 
It is protected so that you can't surf to it and grab files out of it.

* Don't upload files into this folder.


Topic: Cookieless ASP

Show >>


Topic: CSS cursor style

Show >>


Topic: css

The id Selector

The id selector is used to specify a style for a single, unique element.

The id selector uses the id attribute of the HTML element, and is defined with a "#".

The style rule below will be applied to the element with id="para1":

Example
#para1
{
text-align:center;
color:red;
} 


The class Selector
The class selector is used to specify a style for a group of elements. Unlike the id selector, the class selector is most often used on several elements. 

This allows you to set a particular style for many HTML elements with the same class. 

The class selector uses the HTML class attribute, and is defined with a "."

In the example below, all HTML elements with class="center" will be center-aligned:

Example
.center {text-align:center;} 

You can also specify that only specific HTML elements should be affected by a class.

In the example below, all p elements with class="center" will be center-aligned:

Example
p.center {text-align:center;} 


Topic: Internationalization

Show >>


Topic: JSDoc_javascript_style_guide

                

 Annotations used in JSDoc 


  Tag  Description  

	 @author 	 Developer's name  

	 @constructor 	 Marks a function as a constructor  

	 @deprecated 	 Marks a method as deprecated  

	 @exception 	 Synonym for @throws  

	 @param 	 Documents a method parameter; a datatype indicator can be added between curly braces  

	 @private 	 Signifies that a method is private  

	 @return 	 Documents a return value  

	 @see 	 Documents an association to another object  

	 @this 	 Specifies the type of the object to which the keyword this refers within a function.  

	 @throws 	 Documents an exception thrown by a method  

	 @version 	 Provides the version number of a library    

   JSDocタグリファレンス    

   タグ   書式と例   説明   

	 @author  	

 @author メールアドレス (名 姓)  例: 


 /** * @fileoverview テキストエリアを扱うためのユーティリティ群。 * @author kuth@google.com (Uthur Pendragon) */     	 ファイルの著者、またはテストの所有者を記載します。通常 @fileoverview  を含むコメントの中でのみ使用されます。  

	 @code  	

 {@code ...}  例: 


 /** * 選択されたものの中で次の位置に移動します。 * Throws {@code goog.iter.StopIteration} 最後尾を *超えた場合に発生する。 * @return {Node} 次の位置のノード。 */ goog.dom.RangeIterator.prototype.next = function() { // ... };    	 JSDocの説明文に含まれる語句がコードであることを示します。生成されたドキュメント内で適切に整形されることが想定されています。  

	 @const  	

 @const  例: 


 /** @const */ var MY_BEER = stout  

 /** * 名前空間が好きなビールの種類 * @const * @type {string} */ mynamespace.MY_BEER = stout  /** @const */ MyClass.MY_BEER = stout  

 /** * リクエストを初期化します。 * @const */ mynamespace.Request.prototype.initialize = function() { 

 // サブクラスはこのメソッドをオーバーライドできません。 }    	 変数(またはプロパティ)が読み取り専用であることを示します。このタグはインラインで記述するのに向いています。  @const が付けられた変数はある値への固定された参照と見なされ、 @const 付きの変数やプロパティが上書きされているとCompilerは警告を出力します。  データ型を明確に推測できるのであれば型の宣言は省いてもかまいません。その他のコメントの追加も必須ではありません。  メソッドに @const  が付けられている場合、そのメソッドに対しては単に上書きだけでなく、サブクラスによるオーバーライドも禁止されていることを意味します。  @const のより詳細な説明は「定数」を参照してください。  

	 @constructor  	

 @constructor  例: 


 /** * 長方形。 * @constructor */ function GM_Rect() { ... }    	 クラスの説明の中で使い、関数がコンストラクタであることを示します。  

	 @define  	

 @define {型名} 説明文  例: 


 /** @define {boolean} */ var TR_FLAGS_ENABLE_DEBUG = true; 

 /** @define {boolean} */ goog.userAgent.ASSUME_IE = false;    	 コンパイル時にCompilerによって上書きされる定数であることを示します。左の例でコンパイルフラグに --define=goog.userAgent.ASSUME_IE=true と指定すると、ビルド後のファイルでは goog.userAgent.ASSUME_IE の値はtrueに置き換えられます。  

	 @deprecated  	

 @deprecated 説明文  例: 


 /** * ノードがフィールドかどうかを判定します。 * @return {boolean} 要素の内容が *編集可能ならtrue。ただし要素そのものは *編集不可。 * @deprecated isField() を使ってください。 */ BN_EditUtil.isTopEditableField = function(node) { // ... };    	 関数、メソッド、プロパティをこれ以上使うべきでないことを伝えます。説明文の中でそれに替わるものを指示するのが普通です。  

	 @dict  	

 @dict 説明文  例: 


 /** * @constructor * @dict */ function Foo(x) { this[x] = x; } var obj = new Foo(123); var num = obj.x;// 警告  (/** @dict */ { x: 1 }).x = 123;// 警告    	 コンストラクタ(左の例の Foo )に @dict  が付けられた場合、 Foo  オブジェクトのプロパティへのアクセスは角括弧による表記法でのみ可能となります。アノテーションをオブジェクトリテラルに直接記述することもできます。  

	 @enum  	

 @enum {型名}  例: 


 /** * 3つの状態を値にもつ列挙型。 * @enum {number} */ project.TriState = { TRUE: 1, FALSE: -1, MAYBE: 0 };    	

	 @export  	

 @export  例: 


 /** @export */ foo.MyPublicClass.prototype.myPublicMethod = function() { 

 // ... };    	 --generate_exports  フラグを付けてコンパイルを実行すると、左のコードは次のように出力されます:  

 goog.exportSymbol(foo.MyPublicClass.prototype.myPublicMethod, foo.MyPublicClass.prototype.myPublicMethod);  コンパイル前のシンボルがエクスポートされているのが分かります。 @export  を使用するには以下の条件のどちらかを満たしていなければなりません。  1. //javascript/closure/base.js をインクルードしている 2. コードベース内に goog.exportSymbol と goog.exportProperty の両方が同じメソッドシグネチャで存在している。  

	 @expose  	

 @expose  例: 


 /** @export */ MyClass.prototype.exposedProperty = 3;    	 外部公開されているプロパティであることを宣言します。外部公開されたプロパティには削除、名前の変更、圧縮、Compilerによるいかなる最適化も実施されなくなります。同じ名前のプロパティを個別に最適化することはできません。  ライブラリのコードに対しては @expose  を使用すべきではありません。今まで正常に行われていたプロパティの削除を妨げることになるからです。  

	 @extends  	

 @extends 型名 @extends {型名}  例: 


 /** * 常に空のノードリスト * @constructor * @extends goog.ds.BasicNodeList */ goog.ds.EmptyNodeList = function() { ... };    	 @constructor  と共に使用し、あるクラスが別のクラスを継承していることを示します。型を囲む波括弧は省略可能です。  

	 @externs  	

 @externs  例: 


 /** * @fileoverview これはexternファイルです。 * @externs */  var document;    	 externファイルであることを宣言します。  

	 @fileoverview  	

 @fileoverview 説明文  例: 


 /** * @fileoverview 何かをするユーティリティ群。その説明には * このように長くてインデントされていないコメントを必要とします。 * @author kuth@google.com (Uthur Pendragon) */     	 ファイルレベルの情報を提供するコメントブロックを構成します。  

	 @implements  	

 @implements 型名 @implements {型名}  例: 


 /** * 形状。 * @interface */ function Shape() {}; Shape.prototype.draw = function() {};  

 /** * @constructor * @implements {Shape} */ function Square() {}; Square.prototype.draw = function() { ... };    	 @constructor  と共に使用し、あるクラスがインタフェースを実装していることを示します。型を囲む波括弧は省略可能です。  

	 @inheritDoc  	

 @inheritDoc  例: 


 /** * @inheritDoc */ project.SubClass.prototype.toString() { // ... };    	 非推奨。 @override を使ってください。    サブクラスのメソッド・プロパティが、スーパークラスのメソッド・プロパティを意図的に隠蔽しており、全く同じJSDocコメントを持つことを示します。 @inheritDoc は @override  を包含する点に注意してください。  

	 @interface  	

 @interface  例: 


 /** * 形状。 * @interface */ function Shape() {}; Shape.prototype.draw = function() {};  

 /** * 多角形。 * @interface * @extends {Shape} */ function Polygon() {}; Polygon.prototype.getSides = function() {};    	 その関数がインタフェースであることを示すために使います。  

	 @lends  	

 @lends オブジェクト名 @lends {オブジェクト名}  例: 

 goog.object.extend( Button.prototype, /** @lends {Button.prototype} */ { isButton: function() { return true; } });    	 オブジェクトリテラルのキーが他のオブジェクトのプロパティとして扱われるべきであることを示します。このアノテーションはオブジェクトリテラルにだけ付けられます。  他のアノテーションとは異なり、波括弧の中の名前はクラス名ではなくオブジェクト名である点に注意してください。それはプロパティが lent(貸与)されているオブジェクトの名前です。例えば @type {Foo} は Fooのインスタンス を意味しますが、 @lends {Foo} は Fooのコンストラクタ関数 のことです。  このアノテーションについてのより詳しい説明はJSDoc Toolkit のドキュメント(日本語)を参照してください。  

	 @license or @preserve  	

 @license 説明文  例: 


 /** * @preserve Copyright 2009 SomeThirdParty. * このファイルに関する完全なライセンス条項と * 著作権表示を記載します。文章は複数行にわたっても構いませんが、 * 必ず末尾は */ で閉じられている必要があります。 */     	 @license または @preserve  が付けられたコメントはCompilerの処理から保護され、コンパイルされたコードよりも前に出力されます。コンパイルの影響を受けないことから、このアノテーションは重要な通知(ライセンスや著作権のような)を行うのに向いています。改行もそのまま残されます。  

	 @noalias  	

 @noalias  例: 


 /** @noalias */ function Range() {}    	 Externファイルの中で使い、この変数または関数に別名を付けてはならないことをCompilerに示します。  

	 @nosideeffects  	

 @nosideeffects  例: 


 /** @nosideeffects */ function noSideEffectsFn1() { 

 // ... }; 

 /** @nosideeffects */ var noSideEffectsFn2 = function() { 

 // ... }; 

 /** @nosideeffects */ a.prototype.noSideEffectsFn3 = function() { 

 // ... };    	 関数やコンストラクタに付けられ、それらの呼び出しが他のコードに影響を及ぼさないことを示します。このアノテーションはCompilerに対し、戻り値が使用されていない場合にそれらの関数を削除することを許可します。  

	 @override  	

 @override  例: 


 /** * @return {string} project.SubClassの人間が理解できる表現。 * @override */ project.SubClass.prototype.toString() { // ... };    	 サブクラスのメソッド・プロパティが、スーパークラスのメソッド・プロパティを意図的に隠蔽していることを示します。コメントにこれ以外の記述が含まれない場合、スーパークラスで書かれた内容がサブクラスに引き継がれます。  

	 @param  	

 @param {型} 変数名 説明文  例: 


 /** * 各項目のBazを問い合わせます。 * @param {number} groupNum 問い合わせのためのサブグループID。 * @param {string|number|null} term 項目名、 *または項目ID、もしnullの場合は全て検索します。 */ goog.Baz.prototype.query = function(groupNum, term) { // ... };    	 メソッド、関数、コンストラクタに対し、それらの引数を説明するために使用します。  型名は必ず波括弧で括られていなければなりません。型名が省略された場合、Compilerは型チェックを行いません。  

	 @private  	

 @private  例: 


 /** * このロガーを監視しているハンドラの配列。 * @type Array.<Function> * @private */ this.handlers_ = [];    	 メソッド・プロパティ名の末尾にアンダースコアを付加する仕様と組み合わせて、メンバがprivateであることを示します。  

	 @protected  	

 @protected  例: 


 /** * 指定されたDOM要素をコンポーネントのルート要素として設定します。 * このメソッドのスコープはprotectedで、オーバーライドできません。 * @param {Element} element コンポーネントのルート要素 * @protected */ goog.ui.Component.prototype.setElementInternal = function(element) { // ... };    	 メソッド・プロパティがprotectedであることを示します。名前の末尾にアンダースコアを付けてはいけません。  

	 @return  	

 @return {型} 説明文  例: 


 /** * @return {string} 最後の項目の16進数表記のID */ goog.Baz.prototype.getLastId = function() { // ... return id; };    	 メソッドと関数に対し、それらの戻り値を説明するために使用します。論理型の戻り値の説明では、コンポーネントが見えるならtrue、そうでなければfalse よりも コンポーネントが見えるかどうか の方が良い書き方です。戻り値が無い場合、 @return  タグは使わないで下さい。  型名は必ず波括弧で括られていなければなりません。型名が省略された場合、Compilerは型チェックを行いません。  

	 @see  	

 @see リンク  例: 


 /** * むやみに項目を追加します。 * @see #addSafely * @see goog.Collect * @see goog.RecklessAdder#add ...    	 他のクラス、関数、メソッドへの参照を記載します。  

	 @struct  	

 @struct 説明文  例: 


 /** * @constructor * @struct */ function Foo(x) { this.x = x; } var obj = new Foo(123); var num = obj[x// 警告 obj.y = asdf// 警告  Foo.prototype = /** @struct */ { method1: function() {} }; Foo.prototype.method2 = function() {};// 警告    	 コンストラクタ(左の例の Foo )に @struct が付けられた場合、 Foo  オブジェクトのプロパティへのアクセスはドットによる表記法でのみ可能となります。また、生成された Foo  オブジェクトへ新しいプロパティを追加することはできません。アノテーションをオブジェクトリテラルに直接記述することもできます。  

	 @supported  	

 @supported 説明文  例: 


 /** * @fileoverview イベントマネージャ * ブラウザ固有のイベントシステムを抽象化した * インタフェースを提供します。 * @supported これまで IE6 と FF1.5 でテスト済みです。 */     	 @fileoverview  を含むコメントブロックで使用し、このファイルの内容をサポートするブラウザを記載します。  

	 @suppress  	

 @suppress {警告1|警告2}  例: 


 /** * @suppress {deprecated} */ function f() { deprecatedVersionOfF(); }    	 ツールからの警告を抑止します。警告の種類が複数ある場合は | で区切ります。  

	 @template  	

 @template  例: 


 /** * @param {function(this:T, ...)} fn * @param {T} thisObj * @param {...*} var_args * @template T */ goog.bind = function(fn, thisObj, var_args) { ... };    	 このアノテーションはテンプレート型を宣言するために使用します。  

	 @this  	

 @this 型名 @this {型名}  例: 

 pinto.chat.RosterWidget.extern(getRosterElement, 

 /** * 名簿ウィジェットの要素を返します。 * @this pinto.chat.RosterWidget * @return {Element} */ function() { return this.getWrappedComponent_().getElement(); });    	 特定のメソッドが呼ばれるときのコンテキストの型を表します。thisがプロトタイプメソッドでない関数から参照されているときに必要です。  

	 @type  	

 @type 型名 @type {型名}  例: 


 /** * 16進数形式のID。 * @type {string} */ var hexId = hexId;    	 変数、プロパティ、式のデータ型を表します。ほとんどの型において波括弧で囲むことは必須ではありませんが、一貫性のためにそれを強制しているプロジェクトもあります。  

	 @typedef  	

 @typedef  例: 


 /** @typedef {(string|number)} */ goog.NumberLike;  

 /** @param {goog.NumberLike} x 数値か文字列 */ goog.readNumber = function(x) { ... }    	 このアノテーションは複雑な型に別名を付けるために使用します。    

   JavaScript data types

	 Type name 	 Example values 	 Description

	 number 	 1, 1.0, -5, 1e5, Math.PI 	
	 Number 	 new Number(true) 	 Number object
	 string 	 'Hello World', String(42) 	 String value
	 String 	 new String('Hello'), new String(42) 	 String object
	 boolean 	 true, false, Boolean(0) 	 Boolean value
	 Boolean 	 new Boolean(true) 	 Boolean object
	 RegExp 	 new RegExp('hello'), /world/g 	
	 Date 	 new Date, new Date() 	
	 null 	 null 	
	 undefined 	 undefined 	
	 void 	 function f() { return; } 	 No return value
	 Array 	 ['foo', 0.3, null], [] 	 Untyped array
	 Array.<number> 	 [11, 22, 33] 	 Array of numbers
	 Array.<Array.<string>> 	 [['one', 'two', 'three'], ['foo', 'bar']] 	 Array of arrays of strings
	 Object 	 {}, {foo: 'abc', bar: 123, baz: null} 	
	 Object.<string> 	 {'foo': 'bar'} 	 Object in which the values are strings
	 Object.<number, string> 	 var obj = {}; obj[1] = 'bar'; 	 Object in which the keys are numbers and the values are strings. Note that in JavaScript object keys are implicitly converted to strings, so obj['1'] == obj[1]; keys are likewise always strings in for...in loops. The Compiler, however, recognizes whether a key works as a number index.
	 Function 	 function(x, y) { return x * y; } 	 Function object
	 function(number, number): number 	 function(x, y) { return x * y; } 	 Function value
	 SomeClass 	 /** @constructor */ function SomeClass() {}  new SomeClass(); 	
	 SomeInterface 	 /** @interface */ function SomeInterface() {}  SomeInterface.prototype.draw = function() {}; 	
	 project.MyClass 	 /** @constructor */ project.MyClass = function() {}  new project.MyClass() 	
	 project.MyEnum 	 /** @enum {string} */ project.MyEnum = { /** Blue */ BLUE: '#0000dd', /** Red */ RED: '#dd0000' }; 	 Enumerated type. JSDoc comments on the enum values are optional.
	 Element 	 document.createElement('div') 	 DOM element
	 Node 	 document.body.firstChild 	 DOM node
	 HTMLInputElement 	 htmlDocument.getElementsByTagName('input')[0] 	 Explicitly typed DOM element
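 To tie the table together, here is a small hand-written sketch (the function and its names are illustrative, not from the Closure Library) that uses several of the type expressions above in annotations:

 /**
  * Sums the lengths of the given strings and records each length.
  * Illustrative only; not part of any library.
  * @param {Array.<string>} items Strings to measure.
  * @param {Object.<string, number>} lengths Map to record each length into.
  * @param {function(string): number=} opt_measure Optional measuring
  *     function; defaults to the string's own length.
  * @return {number} The total of all lengths.
  */
 function sumLengths(items, lengths, opt_measure) {
   var measure = opt_measure || function(s) { return s.length; };
   var total = 0;
   for (var i = 0; i < items.length; i++) {
     var n = measure(items[i]);
     lengths[items[i]] = n;
     total += n;
   }
   return total;
 }

 // sumLengths(['foo', 'ab'], {})  ->  5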
        
See More >>


Topic: MIME

                 Internet media types 

 Internet media types, formerly known as MIME types or Content-Types, are a standard designed to indicate the type of information a file or piece of data contains. In HTML, this identifier is useful for knowing the type of a file before downloading it. It's good practice to provide media type information whenever possible, as in elements with attributes like type, enctype, formenctype and accept. 

 Every Internet media types identifier must comply with the following format: 

 [type]/[tree.](Optional)[subtype][+suffix](Optional)[;parameters](Optional) 

 As you may have already noted, the type and subtype must be present in any Internet media type. The following examples label each of the parts outlined above: 

 image/png  (type: image, subtype: png) 
 application/rss+xml  (type: application, subtype: rss, suffix: xml) 
 video/mp4; codecs=avc1.640028  (type: video, subtype: mp4, parameters: codecs=avc1.640028) 
 application/vnd.google-earth.kmz  (type: application, tree: vnd, subtype: google-earth.kmz) 
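 To make the grammar concrete, here is a naive splitter (hand-rolled for illustration; real parsing should follow RFC 6838 and handle quoting and edge cases, which this ignores):

 // Naive media-type splitter, for illustration only.
 function splitMediaType(mt) {
   var semi = mt.indexOf(';');
   var params = semi >= 0 ? mt.slice(semi + 1).trim() : null;
   var core = semi >= 0 ? mt.slice(0, semi) : mt;
   var slash = core.indexOf('/');
   var type = core.slice(0, slash);
   var rest = core.slice(slash + 1);
   var plus = rest.lastIndexOf('+');
   var suffix = plus >= 0 ? rest.slice(plus + 1) : null;
   var subtype = plus >= 0 ? rest.slice(0, plus) : rest;
   var dot = subtype.indexOf('.');
   var tree = dot >= 0 ? subtype.slice(0, dot) : null;  // e.g. 'vnd'
   return { type: type, tree: tree, subtype: subtype,
            suffix: suffix, parameters: params };
 }

 // splitMediaType('application/rss+xml')
 //   -> { type: 'application', tree: null, subtype: 'rss', suffix: 'xml', parameters: null }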

 Common internet media types 

 Currently, there are nine top-level types, which are: application, audio, example, image, message, model, multipart, text and video. The following sections provide some of the most popular media types used in web applications. 

 Type application  
application/atom+xml: Atom feeds format.  
application/vnd.dart: Dart file format.  
application/ecmascript:  ECMAScript/JavaScript data (equivalent to application/javascript but with stricter processing rules).  
application/EDI-X12:  EDI X12 data.  
application/EDIFACT:  EDI  EDIFACT data.  
application/json:  JSON data.  
application/javascript:  ECMAScript/JavaScript data (equivalent to application/ecmascript but with looser processing rules).  
application/octet-stream: Arbitrary binary data.  
application/ogg: Ogg, a multimedia bitstream container format.  
application/dash+xml:  MPEG-DASH, a multimedia streaming standard.  
application/pdf:  PDF, a document exchange format.  
application/postscript: PostScript format.  
application/rdf+xml:  RDF format.  
application/rss+xml:  RSS feeds format.  
application/soap+xml:  SOAP format.  
application/font-woff:  WOFF (candidate recommendation; use application/x-font-woff until standard is official).  
application/xhtml+xml:  XHTML format.  
application/xml:  XML format.  
application/xml-dtd:  DTD format.  
application/xop+xml:  XOP data.  
application/zip: ZIP compressed format.  
application/gzip: Gzip compressed format.  
application/smil+xml:  SMIL format.  
application/vnd.android.package-archive:  APK files.  
application/vnd.debian.binary-package: DEB file format.  
application/vnd.google-earth.kml+xml:  KML files.  
application/vnd.google-earth.kmz:  KMZ files.  
application/vnd.mozilla.xul+xml:  XUL files.  
application/vnd.ms-excel: Microsoft Excel files.  
application/vnd.ms-powerpoint: Microsoft Powerpoint files.  
application/vnd.ms-xpsdocument:  XPS.  
application/vnd.oasis.opendocument.text: OpenDocument Text.  
application/vnd.oasis.opendocument.spreadsheet: OpenDocument Spreadsheet.  
application/vnd.oasis.opendocument.presentation: OpenDocument Presentation.  
application/vnd.oasis.opendocument.graphics: OpenDocument Graphics.  
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet: Microsoft Excel 2007.  
application/vnd.openxmlformats-officedocument.presentationml.presentation: Microsoft Powerpoint 2007.  
application/vnd.openxmlformats-officedocument.wordprocessingml.document: Microsoft Word 2007.  
application/x-7z-compressed: 7-Zip compression format.  
application/x-chrome-extension: Google Chrome/Chrome OS extension, app or theme package.  
application/x-dvi: Device-independent document in  DVI format.  
application/x-font-ttf:  TTF format TrueType Font (unofficial but widely used).  
application/x-javascript: JavaScript (obsolete unofficial type; superseded by application/javascript).  
application/x-latex:  LaTeX format.  
application/x-mpegURL: .m3u8 variant playlist.  
application/x-rar-compressed:  RAR format.  
application/x-shockwave-flash: Adobe Flash format.  
application/x-stuffit: StuffIt archive format.  
application/x-tar: Tarball format.  
application/x-www-form-urlencoded: form encoded data.  
application/x-xpinstall: Add-ons to Mozilla applications.  
application/x-nacl: Native Client web module (supplied via Google Web Store only).  
application/x-pnacl: Portable Native Client web module (may be supplied by any website as it is safer than x-nacl).  
application/x-pkcs12: a variant of  PKCS files.  

 Type audio  
audio/basic: μ-law format, at 8 kHz, 1 channel.  
audio/L24: 24-bit linear  PCM format, at 8–48 kHz, 1 to N channels.  
audio/mp4:  MP4 format.  
audio/mpeg:  MP3 or other  MPEG format.  
audio/ogg: Vorbis, Opus, Speex,  FLAC and other formats in an Ogg container.  
audio/flac: native  FLAC format (FLAC in its own container).  
audio/opus: Opus streamed format.  
audio/vorbis: Vorbis streamed format.  
audio/vnd.rn-realaudio: RealAudio format.  
audio/vnd.wave:  WAV format.  
audio/webm: WebM open media format.  
audio/x-aac:  AAC format.  
audio/x-caf: Apple's  CAF audio files.  

 Type image  
image/gif:  GIF format.  
image/jpeg:  JPEG  JFIF format.  
image/pjpeg:  JPEG  JFIF format (for progressive  JPEG, used before global browser support).  
image/png:  PNG format.  
image/bmp:  BMP format.  
image/svg+xml:  SVG vector format.  
image/tiff:  TIFF image.  
image/vnd.djvu: DjVu image and multipage document format.  
image/x-xcf:  XCF,  GIMP's file format.  

 Type message  
message/http:  HTTP message.  
message/imdn+xml:  IMDN message.  
message/partial: e-mail message split into parts.  
message/rfc822: e-mail message (EML files,  MIME,  MHT,  MHTML).  

 Type model  
model/iges: IGS and  IGES files.  
model/mesh: MSH and MESH files.  
model/vrml:  WRL and  VRML files.  
model/x3d+binary: X3D  ISO standard for representing 3D computer graphics, X3DB binary files (not official but still used).  
model/x3d+fastinfoset: X3D  ISO standard for representing 3D computer graphics, X3DB binary files (not yet official, replaces any use of model/x3d+binary).  
model/x3d-vrml: X3D  ISO standard for representing 3D computer graphics, X3DV  VRML files (not yet official, previously known as model/x3d+vrml).  
model/x3d+xml: X3D  ISO standard for representing 3D computer graphics, X3D  XML files.  

 Type multipart  
multipart/mixed:  MIME email.  
multipart/alternative:  MIME email.  
multipart/related:  MIME email (used by  MHTML).  
multipart/form-data:  MIME webform.  
multipart/signed:  MIME security.  
multipart/encrypted:  MIME security.  

 Type text  
text/cmd: commands.  
text/css:  CSS.  
text/csv:  CSV.  
text/html:  HTML.  
text/markdown: Markdown.  
text/javascript: JavaScript (made obsolete in favor of application/javascript, but better supported).  
text/plain: Textual data.  
text/rtf:  RTF.  
text/vcard: vCard (contact information).  
text/vnd.a: The A language framework.  
text/vnd.abc: ABC music notation.  
text/xml:  XML.  
text/x-gwt-rpc: GoogleWebToolkit data.  
text/x-jquery-tmpl: jQuery template data.  

 Type video  
video/avi: Covers most Windows-compatible formats including .avi and .divx.  
video/mpeg:  MPEG-1 video with multiplexed audio.  
video/mp4:  MP4 video.  
video/ogg: Ogg Theora or other video (with audio).  
video/quicktime: QuickTime video.  
video/webm: WebM Matroska-based open media format.  
video/x-matroska: Matroska open media format.  
video/x-ms-wmv:  WMV format.  
video/x-flv:  FLV format.       
        
See More >>


Topic: ReCAPTCHA

                 Adding reCAPTCHA to your site 


 Keys 

 Site key 

 Use this in the HTML code your site serves to users. 

 6LfSEQcTAAAAAMm_lw570StErfFcPplltcfl8anB  


 Secret key 

 Use this for communication between your site and Google. Be sure to keep it a secret. 

 6LfSEQcTAAAAAOKPgkahZt3NGVwc1prYfh8HLh5G    

 Step 1: client-side integration 

 Paste this snippet before the closing </head> tag on your HTML template:

 <script src="https://www.google.com/recaptcha/api.js"></script>

 Paste this snippet at the end of the <form> where you want the reCAPTCHA widget to appear:

 <div class="g-recaptcha" data-sitekey="6LfSEQcTAAAAAMm_lw570StErfFcPplltcfl8anB"></div>

 The reCAPTCHA documentation site describes more details and advanced configurations.  


 Step 2: Server side integration 

 When your users submit the form where you integrated reCAPTCHA, you'll get a string named g-recaptcha-response as part of the payload. To check whether Google has verified that user, send a POST request with these parameters:

  URL: https://www.google.com/recaptcha/api/siteverify   

	  secret (required) 	 6LfSEQcTAAAAAOKPgkahZt3NGVwc1prYfh8HLh5G  

	  response (required) 	 The value of g-recaptcha-response.  

	  remoteip (optional) 	 The end user's IP address.  
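
 As an illustration, a minimal Node.js sketch of that verification call might look like this (verifyRecaptcha is a made-up helper name; error handling is omitted):

 // Sketch only: posts a g-recaptcha-response token to Google's
 // siteverify endpoint and reports whether verification succeeded.
 var https = require('https');
 var querystring = require('querystring');

 var SECRET_KEY = '6LfSEQcTAAAAAOKPgkahZt3NGVwc1prYfh8HLh5G';

 function verifyRecaptcha(token, remoteIp, callback) {
   var body = querystring.stringify({
     secret: SECRET_KEY,
     response: token,
     remoteip: remoteIp   // optional
   });
   var req = https.request({
     host: 'www.google.com',
     path: '/recaptcha/api/siteverify',
     method: 'POST',
     headers: {
       'Content-Type': 'application/x-www-form-urlencoded',
       'Content-Length': Buffer.byteLength(body)
     }
   }, function(res) {
     var data = '';
     res.on('data', function(chunk) { data += chunk; });
     res.on('end', function() {
       // The JSON response carries a boolean 'success' field.
       callback(JSON.parse(data).success === true);
     });
   });
   req.write(body);
   req.end();
 }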
        
See More >>


Topic: WebDeployManual_2012

Show >>


Topic: WebGrid - Get the Most out of WebGrid in ASP_NET MVC

See More >>